Thinking about the Law of Quadratic Reciprocity

Mathematics, the way it is currently written, can be difficult to read. Sometimes it helps to see how people think about a topic or theorem before (or after, or during) the reading of a proper treatment or rigorous proof. The purpose of this post is to provide such a view regarding the proof of the famous law of quadratic reciprocity. There are many details missing, on purpose, and the hope is that it reads like a good story that’s both interesting, believable and easily verifiable.

Basically this “law” provides a fantastic theoretical tool used to tell us which numbers have square roots mod p. We will not get into the full extent of its power. Rather, the goal here is to (eventually) state it and to shine light on remarkable features of the proof. We start with -1.

When is -1 a square?

As example, $\sqrt{-1}$  makes sense mod 5, since

$2^2 = 4= -1 \mod 5$.

More generally, it turns out that for any odd prime p,

$x^2=-1 \mod p$  has a solution if an only if

$p=1 \mod 4$

The reason boils down to Euler’s Criterion (below). Let’s first warm up with Fermat’s Little Theorem (FLT), which states:

$x^{p-1}=1 \mod p \hbox{ if } x \ne 0$

FLT is true because $x, 2x, 3x, ...\ , (p-1)x$  are all different mod p, and so their product is $(p-1)!$

Canceling the factor $(p-1)!$ on both sides gives $x^{p-1}=1 \mod p$. The upshot is that the equation:

$x^{p-1} -1=0 \mod p$

has p-1 solutions (the maximum it could have, considering its degree).

Now, consider the similar equation:

$x^{\frac{p-1}{2}}=1 \mod p$.

$\hbox{If }x_0=y^2 \hbox{ (and } y\ne 0 \hbox{), then } x_0 \hbox{ is a solution, by FLT.}$

Since there are $\frac{p-1}{2}$ different (nonzero) squares mod p (prove it) and since the equation has at most $\frac{p-1}{2}$ solutions (believe it for now–think degree again, working modulo a prime), the solutions of the equation are exactly the squares. This is called Euler’s criterion and often written as:

${\bf \emph{x}}$ is a (nonzero) square mod p if and only if ${\bf \emph{x}^\frac{p-1}{2}=1}$

(By the way, if $x^\frac{p-1}{2}\ne 1$  then  $x^\frac{p-1}{2}=-1 \mod p$ — why?)

We can now understand when -1 can be a perfect square mod p. In general, $(-1)^m$ can only equal 1 if $m$ is even. Thus, $(-1)^\frac{p-1}{2}=1$  if and only if  $\frac{p-1}{2}$  is even; that is, $p=1 \mod 4.$

When does $\sqrt 2$ exist?

This is actually similar but certainly trickier. The answer, for an odd prime p is:

$x^2=2$ has a solution (mod p) if and only if $p=1,7 \mod 8$.

To see why, we first consider something analogous to Fermat’s Little Theorem. Consider:

$x, 2x, 3x, ...\ , \frac{p-1}{2} x \hbox{ mod p}$

Not only are they all different mod p, but much better–there’s exactly one of these multiples of $x$ from each pair:

${\bf \{1,-1\}, \{2,-2\}, \{3,-3\}, \cdots, \{ \frac{p-1}{2}, -\frac{p-1}{2} \}}$

This is not too hard to prove and it’s fun. It will be crucial to remember later during the “reciprocity part” of the post.  This next fact isn’t easy, and it’s amazing:

$x$ is a square if and only if the “negative option” is chosen an even number of times.

The number of times the negative option is chosen is important enough to give it a name, $n$  used for this purpose for the rest of this post. As an example, for p=11 we ask: is 9 a square? (of course it is, $3^2=9$, but just play along!) The multiples of 9 are: 9, 18, 27, 36 and 45. Respectively, they reduce (mod 11) to:

${\bf -2, -4}, 5, 3 \hbox{ and } 1.$

With two negative options, $n=2$, which is even, so 9 is a square mod 11.

The explanation is similar to that of FLT,

$(x)(2x)(3x) \cdots (\frac{p-1}{2} x)=(-1)^n(\frac{p-1}{2}) ! \mod p$

Working mod p, we cancel the $(\frac{p-1}{2})!$ from both sides to get:

$x^\frac{p-1}{2}=(-1)^n.$

Combine this with Euler’s Criterion and we can now determine which numbers are squares modulo our odd prime p:

Key Fact:

$x$ is a square mod p if and only if $n$ is even.

If $x=2$, then it is a fun, tractable exercise (class project to help students discover strong induction?) to determine which primes have $n$ even, namely, $p=1,7 \mod 8.$

This is part of the law—for any odd prime $p$, we know that 2 is a square mod p if and only if $p=1 \hbox{ or } 7 \hbox{ mod 8.}$

Context is everything

One of the really cool things about quadratic reciprocity is that one needs to keep changing perspectives and contexts to understand it. This makes it a great introduction to modern mathematics. Take for example the notion of parity, which means evenness or oddness of an integer. On the surface, 5 and -5 have the same parity. In the above discussion, we considered sets {1,-1}, {2,-2}, … viewed mod p. If p is an odd prime, then -2 really means p-2 which has the opposite parity of 2. Perhaps these sets should have been written {1,p-1}, {2, p-2}, … This awareness of multiple perspectives helps with what follows.

The reciprocity part

The last part of the law of quadratic reciprocity concerns two odd primes p and q. It connects the answers to the two questions:

1. Is q a square mod p?
2. Is p a square mod q?

So, should we work mod p or mod q? We’ll lean towards mod p but really we’ll use neither!

Any positive integer $a$ can be decomposed as a “big part” plus a “remainder” with respect to division by p. That is:

$a=p \big [ \frac{a}{p} \big ]+r$

where [x] denotes the floor of x and $0\le r\le p-1 .$

So far we’ve been focusing on the “remainder” and ignoring the “big part.” We’ll do this decomposition for our usual set of numbers, namely  $a, 2a, 3a, \cdots , (\frac{p-1}{2})a .$

This time, we’ll add them, grouping the “big parts” together and “remainders” after.

As an illustrative example, we take $p=11, a=7 .$

We’ll take out a common factor of 7 on the left hand side, and a factor of 11 from the “big parts” on the right hand side. Thus:

${\bf 7(1+2+3+4+5)=}$

${\bf 11\big( [\frac{7}{11}]+[\frac{14}{11}]+[\frac{21}{11}]+[\frac{28}{11}]+[\frac{35}{11}] \big) +(7+}3+{\bf 10+6+}2{\bf)}$

From this simple example equation, one can see the main point (keep reading to see how). Even though this is an equation of integers, we can draw on our experience working mod p and recognize $n=3$ in the sense that 3 of the remainders had the “negative option.” Indeed, imagining mod 11, we know that 7 is not a square and we identify  $7=-4, 10=-1 \hbox{ and } 6=-5.$

Yet, crucially, our example equation isn’t mod 11—it’s an equation of integers. Let’s reduce it mod 2, where multiplication by an odd number has no effect (so we ignore the factor of 7 on the left and the factor of 11 on part of the right). Thus, viewed mod 2:

${\bf (1+2+3+4+5)}=$

$\big( [\frac{7}{11}]+[\frac{14}{11}]+[\frac{21}{11}]+[\frac{28}{11}]+[\frac{35}{11}] \big) +{\bf (7+3+10+6+2)}$

Now focus on the remainder part: (7+3+10+6+2). Compare its parity with what we’d have if we never needed the negative option, ie. (4+3+1+5+2). The difference is evidently $n \hbox{ mod 2}$ — one parity change for each negative option.

But the left hand side must be (a permuted version of) (4+3+1+5+2). Thus:

$0=[\frac{7}{11}]+[\frac{14}{11}]+[\frac{21}{11}]+[\frac{28}{11}]+[\frac{35}{11}]+{\bf \emph{n}}$

or, since we’re working mod 2:

$n=[\frac{7}{11}]+[\frac{14}{11}]+[\frac{21}{11}]+[\frac{28}{11}]+[\frac{35}{11}]$

In other words, the parity of the sum of the quotients is the same as the parity of  $n$,  which (by what we called “Key Fact”) tells us if $a$ is a square! So, 7 is not a square mod 11, because $n=3$ which is odd—but equivalently—because the sum of quotients

$[\frac{7}{11}]+[\frac{14}{11}]+[\frac{21}{11}]+[\frac{28}{11}]+[\frac{35}{11}]=0+1+1+2+3$

is also odd.

This will always work provided p is an odd prime and $a$ is an odd positive integer.

Therefore, a fundamental property about remainders has been wonderfully transformed into one regarding quotients. This is extremely interesting in its own right, and is also the main point. Now to state the theorem and clarify what’s left to show.

Theorem: For odd primes p and q, the two questions

1. Is q a square mod p?
2. Is p a square mod q?

have the same answer if and only if $\frac{p-1}{2} \frac{q-1}{2}$ is even.

We have seen how the answer to question 1 is controlled by the parity of:

$[\frac{q}{p}]+[\frac{2q}{p}]+\cdots +[\frac{((p-1)/2)q}{p}]$

Since both are odd primes, we can reverse the roles of p and q and the answer to question 2 is controlled by the parity of

$[\frac{p}{q}]+[\frac{2p}{q}]+\cdots +[\frac{((q-1)/2)p}{q}]$

All that remains is to show that the sum of these two expressions has the same parity as

$\frac{p-1}{2} \frac{q-1}{2}.$

But more is true: they are exactly equal. While remarkable, this equality is tractable and it may be possible (with hints!) to set this up as a class project, perhaps in a pre-calculus class after a discussion of inverse functions. But that’s a subject for another post.

We finish here by merely illustrating the equality on our example p=11, q=7. One sum of quotients we have already calculated as 0+1+1+2+3=7, the other is

$[\frac{11}{7}]+[\frac{22}{7}] +[\frac{33}{7}]=8.$

Also for this example $\frac{p-1}{2} \frac{q-1}{2}=(5)(3)=15.$

Perfect.

Fun with Fractions—from elementary arithmetic to the Putnam Competition—the first 1/2.

Which number is smaller, 1/6 or 1/7? The answer should be intuitively clear (without needing to find a common denominator!) and therefore it is instructive as well. Now let’s make it more exciting by taking it as far as we can.

First, we make it into a general fact:

Fact 1: Whenever n is a positive real number, 1/(n+1) < 1/n.

Furthermore, since 1/7 < 1/6, we can subtract each fraction from 1 to see that 6/7 is closer to 1 than 5/6 is; we easily deduce that 5/6 < 6/7 and get a new fact! Another tool for the toolbox of the intuition:

Fact 2: Whenever n is a positive real number, (n-1)/n < n/(n+1).

Now for a more exciting question:

Find integers a and b such that

5/6 < a/b < 6/7.

The most common (reflex) reaction is to find the common denominator (42), and to find a fraction between 35/42 and 36/42 (for example, 71/84).

But let’s make this problem much more interesting by requiring that you find the fraction a/b with the smallest possible integer denominator. A denominator much smaller than 84 would surely mean a numerator much smaller than 71.

Thus, inspired by our two Facts, (and eager to use them wherever we can) let’s focus on the numerator.

Warm-up question 1: Is it possible that a=b-1? (can our in-between fraction a/b have a numerator that’s 1 less than its denominator? Recall 5/6 < a/b < 6/7.)

Fact 2 is very well-suited for assisting with the answer this question! Indeed, for a/b to satisfy both inequalities, b would need to be between 6 and 7. Our first warm-up question was even a bit silly. Undeterred we ask:

Warm-up question 2: Is it possible that a=b-2? (Recall 5/6 < a/b < 6/7.)

As an example, we consider b=8. One of the inequalities holds, namely 6/8 < 6/7, which we see by using Fact 1 with n=7. However, the left inequality does not hold: 5/6 is not smaller than 6/8. Why? Because 6/8 = 3/4 and 3/4 < 5/6 (by Fact 2).

Let’s try another example, b=9. By way of analogy, we expect 7/9 < 6/7 but we don’t believe the other inequality (i.e. we expect that 5/6 > 7/9). Last time, we reduced a fraction, noticing that 6/8 = 3/4, so that we could use Fact 2. It doesn’t seem that we have such luck here; but we do: 7/9 = 3.5/4.5, and Fact 2 works for all real numbers; we see exactly where 3.5/4.5 fits!

If we want to use Fact 2 on the fraction a/b we would, in the same way, “reduce” it by using half of the original numerator a and half of the original denominator b. The advantage is that Fact 2 now explains everything. In particular, the new denominator b/2 must be strictly between 6 and 7. Since b is an integer, b/2=6.5. Thus, b=13. Our fraction is thus found:

5/6 < 5.5/6.5 < 6/7… Whoops, I mean:

5/6 < 11/13 < 6/7.

But is 13 the smallest possible (positive integer) denominator?  The answer lies with:

Warm-up question 3: Is it possible for a < b-2? (Recall 5/6 < a/b < 6/7.)

At this point, the question is hardly a “warm-up” but I’m calling it that, since it’s similar to the other warm-up questions.

Another way of asking the same thing: Is 2 the smallest option for b-a? We know b-a is at least 2 and that 13 is the smallest denominator with b-a=2.  Could b-a=3? At this point the same style of argument works. Thus, we “reduce” the fraction by replacing the numerator a with a new numerator (1/3)a and replace the denominator b with new denominator b/3. Even if the denominator is no longer an integer, the numerator is 1 less than it, so we can still use Fact 2. With b/3 > 6, we see that b > 18. If b-a > 3 then b would be even bigger. Thus, if we are looking for the smallest denominator, then b-a=2.

Now for a similar question, from the 54th William Lowell Putnam Mathematical Competition for Undergraduate students:

Question from the 54th Putnam:

Find the smallest positive integer b such that for every integer m with 0<m<1993, there exists an integer a for which m/1993 < a/b < (m+1)/1994.

It’s not quite the same as the question we’ve been considering, but if we alter the question by fixing m=1992 then it’s fairly close to it!

Indeed, we just finished a miniature version of the m=1992 special case, by discovering that 5/6 < a/b < 6/7 has a solution with b=6+7. Thus, we might guess that b=1993+1994=3987 is the answer for the Putnam Question. It’s suggested by the special case where m=1992. But will this choice of b work for every m?

Let’s return to our smaller question and make the comparison with the Putnam question more exact.

Small version of Putnam Question:

Find the smallest positive integer b such that for every integer m with 0<m<6, there exists an integer a for which m/6 < a/b < (m+1)/7.

We know from our work that the answer is at least b=13 (to accommodate m=5) and we expect the answer is b=13. To be sure, we need to know that there are integer answers to the following questions:

Question 1: 1/6 < ?/13 < 2/7 (corresponds with m=1)

Question 2: 2/6 < ?/13 < 3/7 (m=2)

Question 3: 3/6 < ?/13 < 4/7 (m=3)

Question 4: 4/6 < ?/13 < 5/7 (m=4)

We’ll focus on Question 1 and the rest will fall into place. We want to know if:

1/6 < a/13 < 2/7 has an integer solution for a. We need a slightly improved version of Fact 1:

Fact 1 (improved): If m > n > 0 then 1/m < 1/n. (The previous version essentially had m=n+1. We will need m=n+0.5).

We use this by observing that 2/13 = 1/6.5; hence we see that 2/13 is too small (i.e. a>2). On the other hand, since 4/13 = 2/6.5, we know that 4/13 is too big and hence a<4. (We used the improved version of Fact 1 again). Hence a=3 is the only possibility. We check if it works:

1/6 < 3/13 < 2/7. This may look more appealing to us (based on our work so far) if we write:

1/6 < 1.5/6.5 < 2/7. Perhaps one more equivalent form:

2/12 < 3/13 < 4/14.

This looks very compelling and suggests another fact:

Fact 3: Assume y > x > 0. Then x/y < (x+1)/(y+1).

This can be very intuitive. If you follow any sports team who wins x of their first y games, and they win their next game, their proportion of games won has surely gone up! The proof is almost as easy. Read it backwards:

Proof of Fact 3:

x<y

x+yx<y+yx

(1+y)x<y(1+x)

x/y < (x+1)/(1+y)

End of proof.

Question 2 is now easy, since Fact 3 yields

4/12 < 5/13 < 6/14.

Question 3 and 4 are almost the same.

Thus, while we are still far away from a formal proof; our made-up “small version” of the Putnam Question is not that hard; the actual Putnam Question works on the same principles (try it!). It’s fun if we’ve been thinking about the ideas. A formal solution and full proof can be quite short and elegant. Three such solutions are given in “The William Lowell Putnam Mathematical Competition 1985-2000” by Kedlaya, Poonen and Vakil. Our above discussion is an attempt to make the problem seem like a natural extension of very elementary ideas.

Notes:

1. If a, b, c and d are all positive real numbers, and if a/b < c/d then a/b < (a+c)/(b+d) < c/d. This is a popular, more general version of Fact 3 and just as easy to prove. We will explore this further in an upcoming follow-up post, “Fun with Fractions—from elementary arithmetic to the Putnam Competition—the second 1/2.”
2. Among many (social) circles, Putnam Questions have a reputation of being too difficult to be useful. My attitude is that on every Putnam Competition, there is at least one question per session where a less experienced problem-solver can play with the ideas and learn a great deal, even without solving it. This may be such a problem.

Reflections on the 2018 Joint Mathematics Meetings in San Diego California.

The American Mathematical Society is big; so is the Mathematical Association of America. Once a year, these giant organizations have a meeting welcoming all members of both organizations. This year there were over 6000 registered participants. That’s a lot of mathematicians!

The meeting is so big, it’s almost too big for some of us shy people. It’s surprisingly easy to get lost in a crowd. Thus, it had been more than 10 years since I had been to one of these mammoth events. To my surprise, I found it friendly and inviting. I knew several people, accidentally ran into a few others (this was wonderful) and met many people as well. The exhibits were excellent and inspiring. The undergraduate poster session had more than 400 posters. The students I talked to about their work were all impressive, even outstanding.

I was there for a purpose, giving a talk at a special session (on this preprint). At this kind of conference, there are always many, many things going on at the same time. It’s possible to go to your little area and not much else. Fortunately, my stay outlasted my special session, so I had lots of opportunities to look around. There are plenary talks and award winning talks, intended for the widest possible audience. But instead of going to these, I deliberately went to small sessions; talks here and there, partly randomly, partly to see what other people in completely different areas talk about, and how they talk about it. By way of analogy, consider a tourist, who can certainly do touristy things, but it’s much more fun to mingle with the local people. The small sessions gave me some nice insights as to what people are doing and how hard everyone is trying. Some of these sessions were close to my interests and others were completely new. This type of exploration is rarely possible at a small, specialized conference. I wish I had talked to more people after their talks. I won’t wait ten years again!

P.S. On the day I returned home, my flight was delayed by a few hours. I decided to just stay at the airport. I had work to do, and no distractions. The airport was crowded (many flights were delayed). To my pleasant surprise, I met three people at the airport who had been at the meetings, including two people who sat down beside me at different times, asking if I was a mathematician–two of them were people whose talks I had attended!

Proofs are not only often beautiful but also necessary

Here are a just a few examples where patterns which appear to hold, start to fail for large numbers. Of course, as humans, we have a biased view of “large” in the context of numbers.

1. Consider the sequence, 12, 121, 1211, 12111, 121111, … etc. Are they all composite? Well the first 100 are. Is that enough evidence to be confident that they are all composite? Nope—you can find the first prime number in the sequence yourself, or see this post by Alon Amit, with this and other much better examples.
2. Primes at most x are more often congruent to 2 mod 3 (as opposed to 1 mod 3) until x=608, 981, 813, 029. (I.e., more than 600 billion. See page 203 of “A Friendly Introduction to Number Theory” by Silverman, 4th edition).
3. Fermat numbers, of the form, 2^(2^i) + 1 were known (by Fermat) to be prime for i=0, 1, 2, 3 and 4. So that’s only 5 numbers. But the fifth one, 65 537 is so big! Perhaps this influenced Fermat’s unfortunate (or perhaps fortunate?) decision to conjecture that they are all prime. Perhaps he even had a guess regarding the significance Gauss would find for his primes 150 or so years later. Well the next (at least) 28 Fermat numbers aren’t prime. Maybe there’s none others? (How embarrassing would that be?) Here is “compelling evidence” by Boklan and Conway that the ONLY Fermat numbers that are actually prime may be the first 5 Fermat numbers. Of course, since the paper of doesn’t prove that there are no more Fermat primes, it’s inclusion here is perhaps ironic. So it is currently conceivable that Fermat’s conjecture was not merely incorrect, but spectacularly so… Naturally it was the great Euler who originally factored the first composite Fermat number; 2^32+1=641×6700417.
4. However, even the great Euler had his own, almost maximally incorrect conjecture. Indeed, he hypothesized that there are mutually orthogonal Latin squares of order n if and only if n is not 2 mod 4. It turns out that there are none for n= 2 (this is trivial), 6 (this is hard, and surely where Euler put in some effort); but there is one for all other values of n. (Together, Bose, Parker and Shrikhande showed there are mutually orthogonal Latin squares for all even n congruent to 2 mod 4, n>6). (See p. 122 “Introduction to Combinatorial Designs”, Wallis, 2nd edition).
5. Graph labeling problems are a great place to find other potential “maximally incorrect” conjectures. Gallian has a wonderful survey of recent work, progressing so fast that it would otherwise be quite difficult to keep track of it.

Here’s one of personal significance: Conjecture 1 from this beautiful paper of Gray and MacDougall is very reminiscent of Euler’s conjecture.

The main part of this conjecture is that certain graph labelings exist; but the negative part–that some don’t–fail, completely, as soon as they stopped checking! Here’s the constructive proof. (I’m still proud of this contribution, with Jeremy Holden and James McQuillan.)

1. Another example (this one famous) involves the product of sums of squares with a strikingly wrong conjecture. After Degen provided a formula for showing that the sum of 8 squares multiplied by the sum of 8 squares is again the sum of 8 squares, he believed, quite incorrectly, that there were similar larger formulas for sums of 2^k squares for any k.

Hurwitz proved that the (bilinear) formulas for sums of 2, 4, and 8 squares known to Degen were the only ones. Oops. (The analogous formulas for sums of 2 squares and sums of 4 squares are also significant for complex numbers and quaternions, respectively

1. The circle division problem is often used as an excellent example of how small examples may suggest a striking, yet incorrect answer.
Given n points on a circle, a chord for each pair; chords divide the circle into how many regions? (The maximum number possible). For n=2, 3, 4, 5, 6 it’s 2, 4, 8, 16 and— 31. That’s 31 regions, not 32. The answer is the sum of binomial coefficients (n,4) + (n,2) + 1. It’s so tempting to think it’s a power of 2.

Here is an excellent video for this example by Grant Sanderson.

8. So now we wonder about our research:

The crossing number of the complete graph with n vertices is conjectured to be the square of a particular binomial coefficient for n odd. This seems very compelling, partly just because a binomial coefficient is a really simple expression for such a complicated question.

More precisely, if n=2m+1, crossing # of K_n is conjectured to be the square of the binomial (m,2). It’s true for many drawings, e.g., these ones.

And yet, the smallest unknown case is just m=6. With Pan and Richter, we provide a few clues for the first unknown case. My best shot at understanding so far, (with Richter, and work that I’m very proud of) is summarized here.

I expect this conjecture will be verified for m=6, but (personally) feel very unsure of the conjecture beyond m=6.

So in conclusion…

I hope you get the idea. When mathematicians seem obsessed with rigor, it is at least in part due to our history of making mistakes, seemingly every time we tried to jump to a conclusion. But perhaps there’s something more. So for that, we end with a quote from William Thurston:

what we are doing is finding ways for people to understand and think about mathematics… what they really want is usually not some collection of of ‘answers’-what they want is understanding.

Math is fun–an efficient formula versus an expansive thought process

Our introductory mathematics courses are filled with finished products—not only are our proofs and procedures refined, but so too are the formulae themselves.

This post gives one example of such a perfect formula along with the question: what might we learn from starting with a much weaker version of it?

Geometric Series

The formula for the sum of a geometric series with first term a and common ratio r, is well known. Perhaps you can write it down. The formula is perfect—it provides answers efficiently and the proof is easy to understand.

Now pretend that you don’t know it. Instead you are familiar with just one infinite sum:

Perhaps you discovered it on your own and don’t quite know what it means (suppose you haven’t yet learned about convergence). Nevertheless, let’s see how far you can get with it—an imaginary contest, if you will, between the use of the standard perfect formula and your smaller formula (the one above).

The most famous geometric series is, of course:

Eg.1:

Your formula, with x=2, wins—no contest. (I’m deliberately not providing answers). How about:

Eg. 2:

Well there are two options for how to adapt your formula. Either add 1 to the answer from eg.1, or else multiply every term from eg. 1, (and hence also the answer).

After experimenting on your own examples, you may find that multiplying every term (and hence the answer) by a constant is very useful!

Now for an alternating series:

Eg. 3:

How about using x=–2 and then multiplying the terms (and hence the answer) by –1? Are there other strategies? Your formula still seems to have the advantage. Let’s try finite sums. This one has only 5 terms:

Eg. 4:

Piece of cake! Use the difference of two infinite sums.

Now for another finite sum and something really cool:

Eg 5:

2+4+8+16+32

Do we dare use the difference of two infinite series again? Let’s try:

Hmm. Using x=1/2 gives the right answer but some of the steps don’t seem right!

Here’s another option:

Easy—we’ve seen the sum in the brackets before.

The first attempt was still useful. It may help us to understand this one:

Eg 6:

Here set x=3/2 and the answer comes out right away.

Can you find an example where the standard perfect formula is really better? (Hint: I can’t). The restriction required for convergence (x>1) seems very natural. The process of adapting the formula by shifting it in various ways is familiar to a mathematician and it is fun; and now you have your own questions to answer: what question might lead a student to discover this simpler formula in the first place? Do you have a direct simple proof? Try it!

For the most awesome middle school and high school teachers

Dear awesome teacher,

Thank you for everything you put into good teaching. I know that’s it’s rare to see solid evidence that your extra efforts are really working. So here are two tiny examples of ways that teachers helped me, decades ago, probably dramatically, but probably in ways that would not be detected by any study on teaching effectiveness. I intersperse these examples with two equally tiny suggestions for mathematics teaching. My hope is that these suggestions may also be helpful.

1. How does one read out loud? How does one become prepared for calculus without learning calculus?

I might have been 6 years old. My classmates and I were challenged to read passages from a book. It sounded like we would read a word, then, eventually, read the next word. One day there was a fleeting comment from the teacher—telling us to have our eyes a little bit ahead of the word we were saying. Was the start of great reading? It probably took years to get good enough at it for it all to pay off. I don’t believe my teacher got better reviews for this transformational suggestion, and I don’t even remember her name. But the idea works everywhere. Sight reading on the violin?—easier. Preparing for your future?—it’s just more natural to anticipate the next step.

I would love to see more of these fleeting comments in grade school mathematics classes! Here is a possible example involving the formulas for the area of a triangle and the area of a trapezoid. To make it easier, we’ll use a right angle triangle. One commonly used way to approach a trapezoid, is to view it as a difference of two triangles. (See the green trapezoid in figure 1).

The green area represents the difference in area between two triangles—both triangles have a vertex at the origin.

Now take it further—imagine a growing right angle triangle, one side formed by part of the x-axis, another side formed from part of the line y=cx, shown in figure 1. (The third side is parallel to the y-axis). Ask about the last bit of growth in the last second—measure the change. Calculus is all about measuring change, anticipating growth, looking slightly ahead, and having a measurement for it. I believe that middle school students will be inspired by gaining a sense of what calculus is, even without having to learn any of the details beyond the ones they are already learning anyway. I have no proof that this line of discussion would prepare any student for calculus courses 10 years later; but surely it can’t hurt; my sense is that many university students pass courses deprived of really knowing or caring why they’re doing what they’re doing. So your role in this process, and therefore your potential influence on your young students, may be much bigger than it appears. Much of this piece, especially “Example 1, Section 3” is accessible without too much prior knowledge of mathematics. While it was originally intended for calculus teachers, I now believe all math teachers could use a small part of it to help their super young students prepare for the idea of calculus.

1. How do we memorize? Why do we memorize? How do we generalize?

“Repeat it here, repeat it there, repeat it everywhere. Write it down, compare it with the correct version and ask yourself if the differences are important. Eventually you will discover what it all means.”

I don’t actually know if anyone told me exactly this, or where I learned to memorize this way; but almost for sure I was given advice like this, from somewhere, when I was very young. Don’t just assume that your students will simply pick this up. Tell them, show them, and it will help (although maybe not soon enough to help your teacher evaluations—sorry!). Here’s an example of how memorization in mathematics led to a possibly new, definitely elegant proof of an old theorem (stated near the end of the post). It’s another triangle fact—simple, yet sophisticated.

Memorize this fact: The distance from the vertex of an equilateral triangle to the opposite side does not depend on the choice of vertex.

What does this mean? Look at triangle ABC in figure 2.

Figure 2. An equilateral triangle.

It means the (drawn) distance from B to AC is equal to the (not drawn) distance from C to AB.

“Repeat it here, repeat it there, repeat it everywhere…” What does our fact mean in figure 3?

It could mean that the distance from P to QB is the same as the distance from B to QP, provided that QBP is an equilateral triangle (so let’s assume that QP is parallel to AC).

Figure 3. Two equilateral triangles.

But equilateral triangle ABC is still in this picture, and guess what else is equal to the distance from B to AC?—the sum of distances from P to AB and from P to AC (see figure 4—can you see why?)

Figure 4. The sum of the lengths of the two dotted blue lines is equal to the altitude of triangle ABC.

By repeating our old fact in a different place, and with a bit of thinking, we now have a new, second fact: If P is on a side of an equilateral triangle (in this case side BC on triangle ABC), the sum of the two distances to the other two sides is the altitude of the equilateral triangle.

Repeat this second fact, looking at figure 5. “Eventually you will discover what it means.”

Figure 5. Viviani’s theorem.

This second fact—applied to equilateral triangle MBN and with P lying on MN—means that the sum of two distances from P to MB and P to BN is equal to the altitude of MBN. (As you guessed, MN is parallel to AC).

Now switch to triangle ABC—the sum of the three distances—from P to AB, from P to BC and from P to AC gives the altitude of ABC (see figure 5). Beautiful; we have:

Third fact: If P is any point in an equilateral triangle, then the sum of the three distances, from P to each of the three sides is equal to the altitude of the equilateral triangle.

This is known as Viviani’s theorem. For a more detailed discussion regarding this proof,  please see this preprint.

Memorization leads to generalization, which is essential for understanding how modern mathematics works. The importance of generalization, and therefore of memorization, can be communicated at very early stages of mathematical development.

Notes:

1. I first saw Viviani’s theorem in “A Friendly Mathematical Competition: 35 years of teamwork in Indiana,” a problem book edited by Rick Gillman, published by the MAA. Our proof of Viviani’s theorem and the resulting preprint mentioned above was a result of discussions with Dr. Armstrong after Putnam training sessions, using this book.
1. I’ve suggested “Example 1” from my paper (coauthored by Darlene Olsen). However, example 2 from the same paper addresses two issues. The first is that it provides a simple calculus-based explanation of the fact that the ratio of area of a circle to the square of its radius is the same as the ratio of its circumference to its diameter. However it also prepares students for the method of cylindrical shells—a common difficult part of a standard calculus course. In fact, an informal preview, 10 years before a calculus class, of the intuitive idea of example 2 could help (we have absolutely no statistical evidence of this). It does seem to me though, that this is one of the possible benefits of society asking that the people who teach math know something about where their students may be headed. I sincerely hope that our society asks for even more mathematics knowledge and passion from our mathematics teachers.
1. I recommend that you continue to follow your own thoughtful judgment regarding pedagogy. Research on pedagogical techniques (just like research on which foods are good for you) may take a lot of time to settle. Please keep in mind that research suggesting that teaching via memorization is not effective, may subtly assume that students already know how to memorize; if one experimental class focuses on something students aren’t getting elsewhere then it may appear to be far more valuable than it would be if, say, all classes were focusing on that same thing. Perhaps memorization will be considered far more valuable in the future when fewer teachers are focusing on it! My suggestion is that you continue to try to do everything well and with a purpose, whether it is memorization, inquiry based learning or some combination of many techniques. Thanks!

Let’s Integrate Calculus

Calculus is a full-brained subject being taught by left-brained instructors to right-brained students. I see many posts lamenting the lack of student understanding resulting from the standard way that a calculus course is taught.

For example, Tevian Dray suggests:

“Skip the fine print. Emphasize examples…Emphasize the need to be fluent with multiple representations…[emphasize] geometric reasoning as the key to conceptual understanding.”

Another recent post by Katie Courage summarizes a study featuring the lack of understanding of calculus I (not a lack of performance in Calculus I!) as a major reason cited by women for dropping out of STEM majors.

So, in order to facilitate an engagement of our full-brains, I would like to encourage the use of imagination and do a better job promoting conceptual understanding.

My main effort in this direction resulted in the (free and open-access Journal of Humanistic Mathematics) paper entitled, “A Truly Beautiful Theorem: Demonstrating the Magnificence of the Fundamental Theorem of Calculus” co-authored with Darlene Olsen, who is a Homer L. Dodge award winner for University Teaching Excellence at Norwich University. Her efforts, and excellence as an expositor ensured the writing of that paper is of high quality!

Before reading it, you may ask yourself (or your students) why the derivative of the area of a circle should be the circumference of the circle, or if you really could convincingly explain the formula for the volume of a cone, or if you’re happy with how students absorb related rates. Do students think of the chain rule as merely a method for calculating derivatives, or can they use it backwards? Do they really get the First Part of the Fundamental Theorem of Calculus? Can you think of a useful classroom example that can be used in the first week of the course, which can illustrate all of the essential features of the course? Ok don’t wait—please read our paper now!

One thing we challenge is the following dogma:

“Limits must be taught before derivatives. Derivatives must be taught before integrals. Riemann sums must be taught before definite integrals.”

For this dogma is simply not true. It’s not true historically and it is very strange indeed that our method of teaching calculus is so far removed from the human thinking that led to its discovery.

In fact I see no advantage in hiding derivatives from students while teaching limits. I see plenty of advantages to revisiting the Fundamental Theorem of Calculus many times in different contexts throughout an introductory course. Our paper suggests very minor changes to an introductory course—so they would be very easy to implement for most experienced teachers. We believe such changes would substantially alter the perception that a student would have of calculus.

A few carefully chosen examples may help to connect limits, derivatives and integrals, leading to a more enriched view of the subject.

So instead of introducing the main topics strictly sequentially, please try—at least in a few more places—to introduce them simultaneously. In other words, let’s integrate calculus! Please also provide me with feedback—I’d love to know how it works out.

Posted in Uncategorized | 4 Comments

A Parity Theorem For Drawings of Complete Graphs

This post gives a (hopefully) very accessible treatment of a beautiful mathematical fact of topological graph theory, called the “parity theorem” of Kleitman. (Two numbers have the same parity if they are either both even or both odd). (Please see this paper for a more conventional discussion).

Pick an odd positive integer, n. Draw the “complete graph” with n vertices (you may think of the vertices as dots, and between any two of them we draw a line. The lines do not need to be straight lines. The lines are called edges. The complete graph with n vertices is denoted K_n). In figure 1 we see two drawings of K_5, the complete graph with 5 vertices. Notice that both drawings (figure 1a and figure 1b) have an odd number of crossings.

In figure 2a there is drawing of K_5 with 4 crossings. However figure 2a has something that’s deemed a bit silly—two of the crossing edges (the red edges) both share a vertex (the bottom right vertex).

In general, there is a technical definition of a “good drawing” that prevents exactly this. There is a reason for this restriction: if two crossing edges are incident with the same vertex, then the drawing can be slightly modified to remove that crossing, without introducing any new crossings in the rest of the drawing, shown abstractly in figure 3a. It is common to ignore a drawing if it is “not good” (a drawing that is not good cannot be part of a drawing of K_n with the fewest number of crossings, for example).

Goodness also requires that edges don’t touch each other tangentially (figure 3b), a pair of edges don’t cross each other more than once (figure 3c), and that no edge crosses itself. If three edges happen to cross at the same point we consider it to be 3 crossings (some will simply insist that 3 edges do not cross at the same point, via a slight transition shown in figure 4, that would not affect any other crossings in the drawing).

Figure 2b shows what happens to the drawing in figure 2a after removing the crossing that is not “good,” and guess what? There are an odd number of crossings in the good drawing shown in figure 2b.

We will soon show that every good drawing of K_5 has an odd number of crossings. First, an easy fact: Every good drawing of K_3 must be without any edge crossings. Furthermore every good drawing of K_3 has an “inside” and an “outside”. If beyond the K_3 there are two other vertices, u and v, such that u and v are both on the same side of the K_3 (in figure 5 they are both outside the K_3), then an edge joining u to v must cross the edges of the K_3 an even number of times in total (possibly zero times if uv doesn’t cross the red K_3 at all. However, if you start to draw uv starting from u, and if your edge uv enters the inside of the red K_3 by crossing a red edge, then it must also return to the outside of the red K_3 to get to v—resulting in a second crossing involving uv. So if there is a crossing involving uv, there must be at least 2 crossings of uv with the red K_3, etc…) Similarly, if u and v are on opposite sides of the K_3 (one is outside but the other is inside) then the edge uv crosses the K_3 an odd number of times. (This simple idea will come in handy when we analyze figure 7 below).

Next, to review and reinforce your understanding of the “goodness” criteria, convince yourself that every good drawing of K_4 has at most 1 crossing. If you try to draw K_4 with two or more crossings, there will be a goodness violation (probably there will be one vertex incident with 2 crossing edges—there are only 4 vertices, so check them all!)

In any case, the upshot is that each crossing requires 4 vertices and actually, the crossings in a drawing of K_5 can be counted by counting the K_4’s that have a crossing.

This may seem much harder than just counting the crossings, but it works well on an abstract level—that is, when we don’t know what the drawing looks like, but still have to prove things about it. This is what makes the theorem and its proof beautiful—we will prove there is something common to infinitely many different drawings of K_5.

To make this abstract approach more concrete, look at figure 1b–if we remove the top vertex and its incident edges, we end up with a K_4 with a crossing in it (figure 1c). If on the other hand, we start with figure 1b and remove the bottom vertex and its incident edges, we get a K_4 that does not have a crossing (figure 1d).

There are three other vertices we could have deleted to get a K_4 (try to draw them!) Only one of the five vertex-deleted K_4’s has a crossing. We see in a different way (other than directly counting crossings) that there is one crossing in the K_5.

Now consider figure 1a. If we delete the bottom vertex and all its incident edges, we are left with a K_4 with 1 edge crossing–the crossing highlighted in red.

As there are 5 vertices you could have deleted, there are 5 different drawings you need to make from figure 1a, each with a different crossing (try it, at least mentally), so we see in a different way that there are 5 crossings in figure 1a.

For a given drawing, if all five of the vertex-deleted K_4’s have a crossing, then the K_5 has 5 crossings, which is an odd number. A good drawing of K_5 either has 5 crossings or else we can find a K_4 inside it so that if we just look at that K_4, it does not have crossings. This motivates:

Lemma: Suppose a drawing D of K_4 has no edge crossings. Suppose that D is part of a good drawing D’ of K_5. Then D’ has an odd number of crossings.

We have an example of a drawing of K_4 without crossings (see figure 6). There is an extra vertex v that we wish to use to extend the K_4 to a drawing of K_5. Although this is a specific picture, there are features of it that are always true. In particular, deleting the edges incident with a specified vertex of the K_4 will result in a K_3 with an inside and an outside (we obtain figure 7a from figure 6 by deleting the edges incident with the vertex ‘a’. Both ‘a’ and ‘v’ are outside the remaining K_3 with vertices b,c and d). This looks like figure 5!

If we wish to count the total number of crossings of a K_5 that involve the edge va, then we should not look at the other edges incident with ‘a’, since the edge va can’t cross them anyway (by goodness–two edges incident with ‘a’ cannot cross each other). Also, va can’t cross the other edges incident with ‘v’ (also by goodness–two edges incident with ‘v’ cannot cross each other), so we won’t worry that we can’t see the edges incident with ‘v’ in figure 7a.

When we count the number of times va crosses anything in the K_5, it must be an even number since ‘v’ and ‘a’ are both on the outside of the K_3 with vertices b,c and d. A similar argument (looking at figure 7b and the K_3 with vertices a,c and d) shows that vb crosses an even number of times. (Note that the va crossings are all different from the vb crossings, by goodness again!).

The number of times vc crosses in D’ is similarly even (see figure 7c); but the number of times vd crosses in D’ is odd, since v and d are on opposite sides of the K_3 made up of a,b and c as shown in figure 7d.

To summarize: All of the crossings of D’ will involve v, since D has no crossings. We add together the number of times each of the edges va, vb, vc and vd cross anything in D’ to get the total number of crossings in D’. Exactly three of these four numbers are even, so the sum is odd. By “goodness,” these numbers represent distinct crossings, (no single crossing can involve two edges incident with ‘v’) and as such these crossings can be added together meaningfully.

All of these facts depended only on the fact that we started with a drawing of K_4 with no crossings, and not the particular drawing in figure 6. (Imagine ‘v’ had been in a different location, for example, very close to ‘d’ inside the K_3 bounded by a,b and c. Check that the argument still works!)

This essentially proves the lemma. Together with the discussion before the lemma, we now know that every good drawing of K_5 has an odd number of crossings. We’ve also seen that every good drawing of K_3 has zero crossings (and zero is even, so “every good drawing of K_3 has the same parity as every other good drawing of K_3” is a true statement) and we know that K_4 has a good drawing with 0 crossings and another good drawing with 1 crossing (so no parity theorem for good drawings of K_4).

Based on these 3 (measly) examples, n=3, n=4 and n=5, we are now ready to state the theorem. For it’s short proof, miraculously, no more geometry is needed:

Theorem (Kleitman): Let n be an odd integer, and D be a good drawing of K_n. Then the number of crossings in D has the same parity as the binomial coefficient “n choose 5”.

Before getting to the proof, here is an example with n=7 to illustrate the idea of the proof. If you have 7 objects, there are 21 ways to select exactly 5 of them (try it). So we say “7 choose 5” is 21. In our context, we count crossings in a K_7 by separately counting the number of crossings in each of the twenty-one K_5’s in the K_7. (This counts each crossing more than once—hmm…but we don’t want the number of crossings, we want the parity of the number of crossings… More explanation to follow!)

Each of these twenty-one numbers is odd (since each of the twenty-one K_5’s has an odd number of crossings). So the sum of the 21 numbers is also odd. When we add these 21 numbers, every crossing would have been counted 3 times (a crossing requires 4 vertices; when you look at a single crossing it therefore belongs to 7-4=3 different K_5’s). So our odd sum of 21 numbers, divided by three, is the number of crossings in the K_7. We don’t know what the number of crossings is (that’s why we didn’t try to count the exact number of crossings), but we know the number is odd, since an integer that is obtained from odd integer divided by 3 is necessarily odd.

In general, to find the number of crossings in any good drawing of K_n, we add “n choose 5” different (unknown, but definitely odd) numbers together and divide by the odd number (n-4) to get the exact number of crossings…And we don’t know what the exact number is but we also don’t care; the sum of “n choose 5” odd numbers must have the same parity as “n choose 5” (regardless of what integer “n choose 5” turns out to be). If this sum is divided by an odd integer and the quotient happens to be another integer, then this quotient also has the same parity as “n choose 5.” But we divide by n-4, which is odd since n is odd. So this completes the proof.

Remarks:

1. This presentation is a first attempt at a response to the criticism: “Why don’t mathematicians or scientists ever write the way they think?” In fact, this “unnecessarily” long exposition does not mean that it is difficult; my hope is quite the opposite. The length of this post is an attempt make it as lucid as possible, for as large an audience as possible. I welcome feedback to improve the clarity, and to reduce the prior expectations on the reader.
2. This discussion is much different from Kleitman’s original approach. For a full discussion, please see the paper: A Parity Theorem for Drawings of Complete and Complete Bipartite Graphs, by Dan McQuillan and R. Bruce Richter, in the American Mathematical Monthly, Vol. 117, No. 3 (March 2010), pp. 267-273. Furthermore, there are several other topics in the paper not covered in this post. Please check it out and enjoy!

Run For Third!

Imagine that it’s a tie game in the bottom of the ninth inning and there is one out. You are a base runner on first base. The outfield is playing back to prevent a double and the hitter hits a shallow fly ball into right field. There is no way to tell if the right fielder will catch it. You must wait between first base and second base to see if the ball will be caught. This is to guarantee that if the ball is caught, you may return to first base without an embarrassing double play, ending the inning.

Except that this strategy lowers your chances of scoring a run. We can prove it, and we did so in the April 2016 edition of Math Horizons, a publication of the Mathematical Association of America (MAA). (Our article, Run for Third! A Defense of Aggressive Base Running is free to members of the MAA, and most university libraries will subscribe to the journal Math Horizons).

The fact is that at the major league level, your chances of scoring from third base with one out are so much higher than your chances of scoring from second base with one out, that you should run even if you don’t know whether or not the ball will be caught.

To be more specific, if I (an aggressive base runner) see the ball hit to right field, and don’t know if the outfielder will get there on time, I will run and I might make it to third base instead of second base. Sometimes this aggressiveness will result in a double play. If you play the conventional way, you may end up at second base. Sometimes (if the ball is caught) you will end up at first base with TWO outs.

The analysis is therefore simple. Compare the advantage of my way versus the advantage of your way. If the ball falls in for a hit, I have the advantage. I’m at third base, where major leaguers score (with one out) more than 60% of the time. You are at second, where major leaguers score (with one out) less than 40% of the time.

If the ball is caught, you have the advantage. But it turns out this advantage is small. At the major league level, a base runner scores from first base with two outs about 14% of the time (only 13% in high leverage situations, like a tie game in the bottom of the ninth).

Each strategy has an advantage—mine when the ball falls in for a hit, and yours when the ball is caught. We can compare these two advantages by using a little bit of undergraduate mathematics (hence our choice of a famously student-accessible journal). It turns out that my advantage is greater than yours. It’s not close. As long as the game situation suggests that playing for one run is the correct strategy, then this version of aggressive base running is also the correct strategy.

This has been missed by modern statisticians simply because it hasn’t been tried. So please try it! *Please* let me know if you see a game where it was used, or a game where it should have been used and wasn’t.

Thanks to my coauthors, Peter Macdonald and Ian McQuillan. We’ve been talking about this for some time. We’ve tried posts before (like this one), as well as other preliminary preprint versions of the paper that were not as clear as this publication. So please read this article for the details!

Useful generalizations, Part II

Very often in mathematics a theorem can be easier to prove if it is viewed as a consequence of a different theorem, which may be far more general than the original.

Our example for this post appeared on the 2014 Collegiate Mathematics Competition, sponsored by the Northeastern Section of the Mathematical Association of America:

A husband and wife invited four married couples to a housewarming party. Just before everyone left, the husband asked each person (including his wife) how many people they had shaken hands with. Their responses were 8, 7, 6, 5, 4, 3, 2, 1, and 0. Given that no one shook hands with his or her own spouse and no pair of people shook hands more than once, determine how many times the husband must have shaken hands with a guest.

In fact this little puzzle is not too difficult, and you may wish to try it for a few minutes before reading further! (Can you do it in your head?)

— Empty space to encourage playing/thinking —

Congratulations! By now you’ve either solved the problem or else realized that it’s not that easy. Either way it’s a win—in both cases the next question is:

Can we make up an analogous question with fewer than four married couples invited to the party? Does the question make sense with just one married couple? With two married couples? With n married couples?

(If you haven’t solved the problem, the one-couple version is a very nice playground, which will hopefully help with the original. If you have solved it, then you may be in store for a better theorem and a different proof).

We don’t know yet whether or not this problem will actually generalize nicely. Nevertheless, for the one-couple version, let’s say the responses to the handshake question are 2, 1, and 0. With only 4 people present at this party, the person shaking 2 hands—not shaking the hand of the 0-handshake person—must shake everyone else’s hand. Hence the 2-handshake and 0-handshake people are married to each other. The 1-handshake person shakes hands with the 2-handshake person and nobody else. Now we have a complete picture of everyone’s handshakes except the husband. Fortunately that’s enough to recover all needed information about the husband, who shakes the hand of one person. His wife also shakes hands with one person.

Now for the two-couple version of the problem: assume the responses to the handshake question are 4, 3, 2, 1 and 0. We could proceed in the same way, forming a complete picture of who must shake everyone else’s hand, starting from the 4-handshake and 0-handshake people, who must be married to each other, etc. But here’s a different tact: once we realize that the 4-handshake and 0-handshake people form a married couple, we remove this couple—induction! Note that this removal necessarily reduces the number of handshakes by 1 for each remaining person. We are in fact reduced to the one-couple version of the problem. This suggests both a general theorem and its proof:

Theorem: Let G be a simple graph with 2n+2 vertices. Assume the degrees of 2n+1 of the vertices are given by the numbers 0, 1, 2, … , 2n. Then the last vertex must have degree n. Moreover, for i=1,2,…,n, the vertex of degree n−i is not adjacent to the vertex of degree n+i, and the two vertices of degree n are not adjacent to each other.

There are a couple of important objections. The first one is that this doesn’t look like the original problem; unless a vertex represents a person and an edge represents a handshake. We’ll address the other objection after the proof.

Proof: Let x be the unknown degree. We wish to prove x=n. We proceed by induction on n, the case n=1 being not too hard (and discussed above). We now assume the result true for n−1. Consider a graph G with 2n+2 vertices as in the hypothesis of the theorem. Delete the vertices of degrees 0 and 2n. This reduces the degrees of each remaining vertex by 1, leaving 2n vertices with degrees:

0, 1, … , 2n−2 and x−1.

By the inductive hypothesis x−1=n−1, the two vertices of degree n−1 are not adjacent to each other, and a vertex of degree n−1+i is not adjacent to the vertex of degree n−1−i (for i=1, 2, … n−1). Replacing the two deleted vertices from G gives the desired result. QED.

Now for the other objection: the hypothesis that no one shook the hand of his or her own spouse seems to have disappeared, suggesting this assumption was not necessary to solving the problem. Yet the conclusion of the theorem includes a description of vertices not adjacent to each other. So our theorem is better—it has a weaker hypothesis and it guarantees that the n-couple version of the problem describes a situation that could actually happen. If we do assume that no one shakes the hand of his or her own spouse, then one can show—by induction on i—that the person who shakes 2n-i hands is married to person who shakes i hands.

So this was much more involved then it had to be. Yet it is satisfying on many levels. We made a small (n=1) toy version of the problem, we provided a generalization and we showed that the hypothesis of the general problem is realizable.

Remarks:

1. The n=3 version of this problem appears in Loren C. Larson’s outstanding book, Problem-Solving Through Problems, in a section entitled, “Draw a Figure.” We chose this complementary, algebraic approach. It’s certainly instructive to compare the two approaches, to see exactly how the use of induction on n gets the same conclusion without explicitly drawing the entire picture.

2. Another written solution (closer to the one in Larson’s book) is available from the Collegiate Mathematics Competition webpage by looking up the practice problems from 2014, question 4.  Many thanks to my colleague Rob Poodiack for bringing my attention to these excellent problems.

3. This problem is perfectly adaptable to many different levels of difficulty and discussion. So it’s a perfect question to consider, even on a lazy summer afternoon. Enjoy!

Posted in Uncategorized | 1 Comment

Useful Generalizations, part I

Very often in mathematics we see a nice argument or proof and we realize that the same argument can prove more than what was originally intended. The purpose of this post is to do just that for two recent posts, “It’s a Mean Value Theorem” and “On the value of believing that you know the answer.”

In the post on the Mean Value Theorem (MVT), we proved as a warm-up example that:

7^(1/3)+9^(1/3) < 4

As a quick review, MVT states that if f(x) is a differentiable function on the open interval (a,b) and continuous on the closed interval [a,b], then there is at least one point z in (a,b) such that f′(z)=[f(b)−f(a)]/(b−a). We rewrite this as:

f(b)−f(a)=(b−a)f′(z) (MVT)

Now set f(x)=x^(1/3) observe that f′(x) is decreasing when x>0 and use MVT twice to get:

f(9)−f(8)<f(8)−f(7). Since f(8)=2, this last inequality gives the desired answer.

These two applications of MVT use different choices for b and a, but in both applications

b−a= 1. As a generalization, we imagine

b−a=d>0, with a+d≤b−d  and we consider any function whose derivative is decreasing:

Which is bigger, f(a)+f(b) or f(a+d)+f(b−d)?

By comparison, our original question had:

a=7, b=9 and d=1.

From two applications of MVT we see that

f(b)−f(b−d) < f(a+d)−f(a)

Hence we have our generalization:

f(a)+f(b)<f(a+d)+f(b−d)

But is this generalization useful?

Here’s where the post, “On the value of believing that you know the answer” comes in to play. The goal of that post was to prove the arithmetic-geometric mean inequality. However, instead of directly proving that the arithmetic average of n numbers was bigger than the n-th root of their product, we instead replaced two of the n numbers, a and b say, with two new numbers a+d and b−d (choosing d so that one of these new numbers was the arithmetic average).  This change maintained the same arithmetic average, and we showed that the n-th root of the product increased. Finitely many of these two-at-a-time changes produces n equal numbers, and the result follows from comparisons.

The same process works for a very analogous application; given n numbers, whose (arithmetic) average is A, then Jensen’s inequality states that

if f′(x) is decreasing then f(A)≥Z

where Z is the arithmetic average of the n outputs

i.e. Z = [f(x_1)+f(x_2)+…+f(x_n)]/n and

A=[x_1 + x_2 +…+x_n]/n).

If the n numbers are all equal, there is nothing to prove. Otherwise, let a be the smallest one, a<A and there must be another one b say, which is bigger than A. If A−a<b−A then set d=A−a. Otherwise set d=b−A. This guarantees that either a+d=A or b−d=A. To prove Jensen’s inequality, replace the two numbers a and b with a+d and b−d. The average of the n new numbers will still be A, but the average of the n outputs will be larger—Our generalization is useful! Hooray!

Fewer of the n numbers will be different from A. We repeat, similarly altering numbers different from A until there are none remaining.

Remarks

1. This proof is well-known, but this discussion suggests a natural way to find it.
2. It is also well-known (and by now probably not surprising) that setting f(x)=log x in Jensen’s inequality immediately yields the AM-GM inequality.
3. In “It’s a Mean Value Theorem” the main example was to show that in any acute triangle with angles A, B and C, that sinA+sinB+sinC>2. Let f(x)=sin x. Notice  f′(x) is decreasing for acute x. Hence by Jensen’s inequality, f((A+B+C)/3)≥(sinA+sinB+sinC)/3. Rearranging slightly gives us:

sinA+sinB+sinC≤3sin(π/3).

This is a nice upper bound to go with the lower bound of the previous post.

It’s a Mean Value Theorem

“The average rate of change will be the same as the instantaneous rate of change somewhere in the interval” under reasonable assumptions, is the gist of the Mean Value Theorem. A pleasant funny review is in the slow-paced 1966 (yes, 1966!) video “The Theorem of the Mean Policeman” . Slightly more technically, if f(x) is a differentiable function on the open interval (a,b) and continuous on the closed interval [a,b], then there is at least one point z in (a,b) such that

f′(z)=[f(b)—f(a)]/(b—a). In our applications below, we prefer to write this last equation as:

f(b)—f(a)=(b—a)f′(z) (MVT).

The purpose of this post is to see the Mean Value Theorem at work in a couple of places where we may not expect to see it. The second example is much more interesting than the first.

Example 1. (Warm-up example): Which is bigger

7^(1/3) + 9^(1/3) or 4?

If you have a calculator or a phone handy, it will probably tell you that the first expression is approximately 3.993. But that’s cheating. Instead consider the function

f(x)=x^(1/3). A straightforward application of the power rule shows that f′(x) is decreasing for x>0.  Hence (using MVT twice):

f(9)—f(8)<f(8)—f(7).

But since f(8)=2, this last inequality gives the desired answer.

Example 2: Prove that in any acute triangle with angles A, B and C, that

sinA+sinB+sinC>2.

I have a vague recollection of trying this problem as part of a Putnam training session as an undergraduate student. Recently I saw it again as a post on Stephen Cavadino’s blog.

In fact I recognize Stephen’s clever solution as being the same one presented at my Putnam club decades ago. It’s such a slick solution that it’s definitely worth reading.

Nevertheless, I’m not sure that I could have come up with it. So here’s an alternative, using MVT.

This solution hopefully also highlights the value of chosing good examples to help understand the problem.

How to solve it:

I experimented with the conditions of the problem, quickly realizing why the “acute” hypothesis is necessary (otherwise sinA+sinB+sinC can be made arbitrarily close to 0). However, the biggest angle could be a right angle and still have the inequality holding.

In fact, a right triangle with hypotenuse 1 makes for an extremely interesting special case. Indeed if C=π/2 (radian measure) then sinC=1, and our required inequality reduces to a special case of the triangle inequality.

Inspired by the simplicity of this example, I decided to compare an arbitrary acute triangle to it. Since we will be comparing different triangles T and T′, we

let f(T) be the function that adds the sines of the three angles of a triangle.

First let’s rank the acute angles; without loss of generality A≤B≤C.  Also, let

D=(π/2)—C. Observe that D<A. (This is because D+C+B<π, but A+C+B=π).

So consider a new triangle T′ with angles A—D, B and C+D (the side lengths won’t matter). The new triangle has a right angle (C+D=π/2) and so by our previous discussion f(T′)>2. It remains to show that

f(T) ≥ f(T′). This follows from:

Lemma: Let D>0 and assume that D ≤ A ≤ π/2—D. Then:

sin(A—D)+1≤sinA+sin(π/2—D) —D[cosA—cos(π/2—D)]

Assuming the lemma is true, we have:

sin(A—D)+1≤sinA+sin(π/2—D)

i.e., we imagine replacing two of the summands in f(T′) with the corresponding summands from f(T).

Hence f(T′) ≤ f(T).

To prove the lemma, we use the mean value theorem twice. Once to show:

sin(A)—sin(A—D) ≥ DcosA (exercise!)

and the other time to show:

sin(π/2)—sin(π/2—D) ≤ Dcos(π/2—D).

Combining these last two inequalities yields the lemma.

Notice that in our application of the lemma, we disregarded the term:

—D[cosA—cos(π/2—D)].  This extra term could easily be used to sharpen the inequality.

Another way to sharpen it is to compare our right triangle T′ with another

right triangle T′′ where the angle B in T′ is replaced by an angle approaching π/2.

This means that f(T′′) is approaching 2.

At the beginning, searching for an example where f would approach 2 was part of the process. Realizing how the inequality may fail to hold for obtuse triangles also suggested having a right triangle. So I see this solution as a natural consequence of playing with the conditions of the problem and experimenting with the right examples.

A cardinal rule of base-running is hilariously incorrect

Sometimes I watch Major League Baseball games. I’ve noticed that the following is never questioned:

Unwritten Rule: With fewer than 2 outs, a base-runner on first base shall be certain that a ball hit in the air will fall in for a hit before straying too far away from first base.

This is a natural consequence of a written rule allowing for double plays when the ball is caught and the base-runner doesn’t  “tag up” before advancing.

So…the purpose of this post is to identify one situation where the unwritten rule is wrong—so wrong that it costs the game.

One runner, on first base with 1 out

Imagine there is a runner on first base with 1 out. It’s a tie game in the bottom of the ninth inning and the outfielders are playing back to prevent a double. The ball is hit in the air, but off the end of the bat—a blooper into right field. The right fielder runs; no one can tell if he’ll get there in time to catch it, but if he does the base-runner better get back to first base to prevent a double play. If the ball falls in for a hit there will only be 1 out but the runner only makes it to second base, since he was standing and waiting to see if the outfielder would catch it.

Should the base-runner stand and wait to see if the ball is caught?

The answer is “no”, and it’s not even close. This is the risk/reward analysis that is apparently not understood. But it’s a really simple argument. When the ball is hit in the air, let’s say the runner makes a snap judgment that it’s “better than even” that the blooper will fall in for a hit. In other words, let p be the probability that the blooper will be a hit and the runner quickly decides p>0.5.  Our analysis involves two possible base-runners, A and B. Player A (A for “aggressive”) breaks protocol and just runs without waiting to see if the ball is caught. His aggressiveness either gets him to third base or else gets him thrown out. Player B (for “baseball professional”) follows protocol, and waits.

Who wins more often, A or B?

Remember it’s tied in the bottom of the ninth, so one run wins the game. If the probability of scoring from third base with 1 out is T (T for “third”) then the product pT is the probability that player A scores. This product captures the situation that the blooper falls in for a hit, and then somehow the base-runner eventually scores. (We don’t care how this happens, we just assume it happens with probability T). He won’t score if the blooper is caught since he’ll be doubled up and the inning will be over.

For player B there are two cases. The first case is that the blooper is a hit, putting him at second base. He could eventually score from second—again we don’t care how he might score from second, but we assume he does so with probability S. The second case is that B could score if the blooper is caught (although it would be much harder since there would be two outs and he would be at first base). His probability of scoring is therefore pS+(1-p)F where F is the probability of eventually scoring from first base with TWO outs.  So what’s bigger pT or pS+(1-p)F?

The punchline

The quantities T,S and F above can all be calculated! (We know how likely it is that a runner will score from third base with 1 out, etc). The short answer is that

pT>pS+(1-p)F whenever p>0.38.

So if you think there is a 38% chance or better that the blooper will be a hit, RUN!

Many more details are available from our preprint:

http://arxiv.org/pdf/1505.00456.pdf

Rally killers and Mariano Rivera

Here is the general justification for running like base-runner B (from  http://baseball.isport.com/baseball-guides/cardinal-rules-of-base-running):

“There’s no worse rally killer than a double play. Moreover, there are few plays more devastating for an offense than a line drive with runners on that turns into a double play. To avoid such a huge momentum shift, you need to make it a priority not to get doubled off on the bases.”

Well I don’t think it’s much of a rally if there are 2 outs and only one runner, on first base. In other words the rally is on life-support as soon as the blooper is caught. Furthermore, if it’s a tie game in the bottom of the ninth, you just need one run, not a rally.  In http://arxiv.org/pdf/1505.00456.pdf we consider “high-leverage innings” defined as the 8th or 9th inning where the difference in score is at most 1. Between the years of 2003 and 2008, Mariano Rivera pitched in 58 different high-leverage innings where there were 2 outs and one runner, on first. None of those runners scored. Unlike most pitchers (where 38% certainty is enough) one only needed to be 14% sure that the blooper would be a hit before running aggressively when Rivera pitched. This is not just because Rivera’s pitching record is better than other pitchers (although it certainly is!) but more importantly it is because he was only slightly better than average at preventing a runner scoring from third with 1 out. This is the point—you don’t need a hit to score from third with one out. In high-leverage situations, one is often facing the pitchers against whom hits are most difficult to obtain, and these are the situations that determine the outcomes of games.

Why don’t they do it this way?

The simple answer is that there is no direct data to show that player A wins more games, because no one ever does it that way. If a team tried it this way, then data could be collected to bear out the conclusions of our indirect arguments. We continue to wait for someone brave…

UPDATE: I’ve written a new post about this topic, in part because we’ve improved it enough that it ended up published in the April 2016 edition of the journal Math Horizons. The new blog post, and even better, the published paper, do a better job of focussing on the main point. Enjoy!

Posted in Uncategorized | 2 Comments

On the value of believing that you know the answer

A beautiful inequality, the arithmetic -mean-geometric-mean inequality (AM-GM) seems reasonable but hard to prove. It states that the usual average of n positive real numbers is at least as big as the n-th root of the product. We’ll experiment with it, guess and believe a few things about it, and then actually use our confidence in those guesses to fashion a remarkably simple proof. Here’s an example:

The average of the three numbers 4, 5, 6 is clearly 5. A consequence of AM-GM is that the cube root of the product (4)(5)(6)=120 must be less than 5. Since the cube root of 120 is less than the 5 (i.e. the cube root of 125), we gain some confidence that it works. As a natural-born-devil’s-advocate, we try to change the example so as to give the geometric mean a better chance. Keeping the average at 5, the best choice for maximizing the geometric mean seems to be: 5, 5, 5. Thus, all the numbers are equal. In this extreme case, both the arithmetic and the geometric means must be the same. In fact this suggests a proof.

Keep the arithmetic average the same and make the geometric average as big as possible. We suspect that the only way to achieve this is to make all the numbers equal. Let’s just believe this and see what happens. Notice that “keeping the average at 5” really means keeping the sum at 15; also “maximizing the geometric mean” actually suggests maximizing the product. When we replaced “4, 5, 6” with “5, 5, 5” we really needed to compare the product (4)(6) with (5)(5). This is a consequence of the difference of squares formula

Difference of Squares: (x — a)(x + a) = x^2 — a^2

with x=5 and a=1. In this example we effectively only replace two numbers—the 4 and the 6—with their average—5 with another 5.

So let’s try a more challenging example—three numbers 4, 4 and 7. If we try the same trick, perhaps we replace the 4 and 7 with a couple of 5.5’s. This gets us a bigger product, just as in the above example, but we don’t seem to be getting closer to proving that having all 5’s yields the maximum product.

Here’s where the belief in the answer helps us: Instead of moving the 4 and the 7 to their average, we move the 4 to the average (namely 5) and move the 7 the same distance in the other direction (to 6).

There are many ways to see why the product (5)(6) should be bigger than the original (4)(7), but we defer to the difference of squares formula once again. This time let x=5.5 and compare what we get with a=1.5 versus a=0.5 to see exactly what’s happening. (Don’t skip over this step—it’s the heart of the matter).

Now we have three numbers 5, 4 and 6 with the same sum, but a bigger product. This triple should look familiar (our first example!)—we remember that the next step is to replace the 4 and 6 with a couple of 5’s.

The experienced mathematician now instantaneously fills in all the details of a proof, arguing that if there is any number other than the arithmetic average, then there must be one both less than the average and another one more than average. So if any of the numbers are other than average we can make the product bigger, and bigger and bigger…etc…

However the main idea can be understood (perhaps at a different level) by elementary school students, provided they are comfortable with the difference of squares formula.

Toward that end, let’s run through a little game on five numbers 8, 8, 9, 9, 10. The total is 44 and the average is 8.8. We repeatedly replace this list of five numbers with another list, always with total 44, and at each stage the product is bigger. Also at each stage, at least one of the numbers is replaced with the average. Thus:

8, 8, 9, 9, 10—>8.8, 8, 9, 9, 9.2

We replaced the first 8 and the last 10. Now the product is bigger—why? What choice of x? What choices for a?

Keep going, replacing the two bold numbers by moving the closer one to the average 8.8 and moving the other one the same amount in the other direction:

8.8, 8, 9, 9, 9.2—>8.8, 8.4, 9, 9, 8.8

8.8, 8.4, 9, 9, 8.8—>8.8, 8.6, 8.8, 9, 8.8

8.8, 8.6, 8.8, 9, 8.8—>8.8, 8.8, 8.8, 8.8, 8.8

It’s just crazy! If any of the numbers are other than average, we can compare it to the situation where all the numbers are average, in a small number of steps. It’s slick, it’s fun and it works.

Remarks:

1. We just had our last Putnam session at Norwich this semester. We solved a problem using the AM-GM and just for fun tried to prove AM-GM. This post came out of that discussion. Thanks to Wise D. and Jessica F. for their role in this discussion! I asked Prof. Rob P. afterwards if he knew of it and he directed me towards this article:

An inductive proof of the AM-GM inequality, by Kong-Ming Chong, American Mathematical Monthly, Vol. 83, No. 5 (May, 1976), p. 369.

As is often the case with mathematical expositions, the problem is that you need to be something of an expert in mathematics to even realize that the article is saying the same thing as our exposition in this post! So we hope you enjoyed it and can do something like this with your students.

2. Originally in our Putnam group, we used calculus, instead of the difference of squares formula. We considered two positive numbers with a fixed sum of S, trying to maximize the product by using the function f(x)=x(S-x). The breakthrough came when we realized that it’s not just about maximizing f, but rather knowing the intervals of increase, since our “belief” in this post led us to increasing f rather than maximizing it. In this discussion I replace calculus with the most elementary idea possible, the difference of squares, to make it as clear, accessible and fun as possible.

3. “Fix the arithmetic average and make the geometric average as big as possible” means the same thing as “Keep the arithmetic average the same and make the geometric average as big as possible.” At the last minute I opted for the latter phrasing.

On the unreasonable efficiency of mathematical writing, part II

The writing of mathematics is very efficient. Perhaps the teaching of mathematics can afford a little more redundancy. Here is a key example:

Mathematical induction-the champion of efficiency.

The beautiful idea:

Induction is often used (and usually taught) to prove formulas that hold for all non-negative integers. For example, the sum of the first n non-negative powers of 2, is equal to one less than the next power of 2. If n=4 it looks like this:

1+2+4+8=16-1

Our next instructions are to add 16 to both sides:

1+2+4+8+16=(16+16)-1

and we see that the pattern continues!

Let’s do it one more time. Start with our newly discovered equation:

1+2+4+8+16=32-1

add 32 to both sides, and voila:

1+2+4+8+16+32=64-1.

It’s pretty neat, and also really clear from this kind of discussion…unless of course…

Efficiency:

Unless of course, our discussion only uses one equation, which corresponds to n=1, the so-called base-case, and you write:

1=2-1.

The base case is just too small to see what’s going on or to notice any patterns. Your students may even feel stupid writing this down, as if it’s a secret handshake and they really don’t know why it’s necessary.

You guessed it; this is the way it’s almost always taught. But wait!–it gets worse; the instructions for seeing how the “pattern” continues are too cryptic for many beginners to follow. They are hidden in the formal details of the so-called inductive step. So it’s a pedagogical lose-lose.

It’s done this way because the base case is the smallest example that will make the proof work both completely and rigorously. After all, a full proof would have to include the case where n=1, and a formal proof works just fine with that as our main example. So let’s be efficient and … actually let’s not.

Focus on the beautiful idea

The beautiful idea behind the principle of mathematical induction should be given priority. That’s the first part of this post, and the first thing to be communicated regarding mathematical induction. Formal proof-writing etiquette can come a little later at the university level, or not at all in elementary school.
(Yes—I’m suggesting that elementary teachers should introduce math induction!)

So let’s use n=4 and n=5 for our examples, then look at the inductive step, keeping in mind how they relate to these examples, and then rewrite the proof the next day.

In other words, let’s give up some of the efficiency requirement.

On the unreasonable efficiency of mathematical writing

In literary writing, we are taught to provide an introduction, a body and a conclusion. We are encouraged to—inefficiently—relate parts of an essay to a central idea. In mathematics, our writing is at times too efficient. Perhaps the teacher can allow the student to be a little less efficient. Here is an example.

A little calculation

“10 is the greatest common divisor, or greatest common factor, of the two numbers 130 and 50.” We may abbreviate this as:

gcd(130,50)=10.

We see this by inspection, by factoring our numbers. There is a harder way to find the answer, called the “Euclidean Algorithm.” It turns out that this method will usually be much easier for large numbers where factoring is difficult. Let us use this example to introduce the Euclidean Algorithm. Consider this little calculation:

130=2(50)+30 … gcd(130,50)=gcd(50,30) (why?)

50=(30)+20 … gcd(50,30)=gcd(30,20) (why?)

30=(20)+10 … gcd(30,20)=gcd(20,10) (why?)

and finally, gcd(20,10)=10, since 10 divides 20.

What is the main pedagocial point?

The question “why?” appears three times in the little calculation. If we’re clever, we can answer all three questions all at the same time, with a little lemma—and we should do this (if you don’t know the answer, try it!). The answer constitutes the central idea behind the Euclidean Algorithm. The pedagogical problem is that the required lemma is usually discussed first; so that the chain of equalities: gcd(130,50)=gcd(50,30)=gcd(30,20)=gcd(20,10) can be suppressed!

Conclusion

My suggestion to the teacher is to introduce the lemma only after a few examples have been worked by the student in this inefficient way. Thus, the lemma is an eventual recognition of a pattern—a celebration of just how useful it appears to be, rather than an efficient shortcut in order to avoid it.

On Teaching Loving Mathematics

I love math—not because I think I’m good at it, but because it’s powerful, it’s supremely beautiful and it’s everywhere. Mathematicians and many others agree! However, it happens all too often that people claim not to agree. Teachers can change this, but how? How does one teach loving mathematics? In this post, I mention just a few techniques to consider, based on very positive experiences.

Last semester I ran a mathematical problem solving course. I did a few specific things that are not usually done in a mathematics class. Several small assignments consisted of answering a single question, which the student chose from a list of five. Students therefore would read more than one question before choosing a question to answer. This was my first victory—thinking about several mathematical questions without necessarily writing about them or handing them in is already part of the math-loving process!

This brings us to the next trick—participation grade. When I was a student, any such notion was offensive. It usually meant the extroverts would get more points, or even worse, that teacher preference would play a role in the final grade. I corrected all of this by giving a mathematically precise definition of participation, with several options for obtaining the points. One option was to do five extra questions beyond the minimum at a convenient time during the course. So if students read questions and liked them, they could get a little credit. Another option for participation points came from short presentations of problems to the class. Since these points were for participation only, students lacking in confidence, who were normally too afraid to present, would still do so. Some of these presentations were admittedly not very good—but here’s the key—they got much better. The motivation for doing a good job changed from getting a good grade, to wanting to do a better job, and ultimately to sharing the enjoyment of having done the work. Classmates were quick to point out ways to improve these participation activities, since they knew that it was all about the process-the participation-and not about the presentation’s grade. I certainly don’t claim that all courses should be like this; but I am claiming that not enough courses are like this. Try it!

Extra Remarks: Congratulations to my Ph.D. supervisor, Dr. Jan Minac:
Fortunately there are some classes where love of mathematics can’t help but be taught, just because of the instructor. As a student, the only time I ever remember experiencing extensive choices for homework questions was at Western University, with Dr. Jan Minac as the instructor. So I was in no way surprised to learn that he just won a prestigious national teaching award! From the CMS media release: “Jan Minac shares personal stories of great mathematicians with his students, so that the students become so involved…” How involved? Click here for the full story. Congratulations Jan!

On the Value of Making up Your Own Question

First, we consider an example of how one might make up a question.
Here is a nice fact:

(1/2)+(1/3)+(1/6)=1

Let’s have some fun with this fact by making it the answer of a question. Think of it as a mystery equation, and it’s our job to point the problem-solver in the right direction to find it. Hmm…

The sum of these 3 denominators is 11. After a bit of thought we see that {2,3,6} is the only set of 3 positive integers with sum 11, such that the sum of reciprocals is equal to 1. So we will tell the solver that we have x numbers whose sum is f(x) and we’ll pick a function f so that f(3)=11. We’ll also mention that the sum of reciprocals is exactly 1.

Now we need a hint so that the solver can be directed towards x=3.
Consider the more general situation where the x numbers are only required to be positive real numbers instead of positive integers. If the sum of reciprocals is 1, then it is well known that the sum f(x) is minimized if the x numbers are equal to each other, and thus equal to x (think of famous inequalities!). Hence x^2<=f(x).

Let’s pick f(x)=5x-4. Then x^2<=5x-4 and this places x between 1 and 4. We are ready to state the problem. Here’s a version of it, close to the way it appeared on the 2005 William Lowell Putnam Mathematical Competition:

Find all positive integers x, k_1, k_2, …, k_x, such that the sum k_1+k_2+…+k_x=5x-4 and such that (1/k_1)+…+(1/k_x)=1.

Our mystery equation corresponds to the case where x=3. There are a couple of other solutions, corresponding to x=1 and x=4.

To be clear, I did not make up this question. I also have no knowledge of what was going through the mind of the person who did make up this question. I was merely having fun with this beautiful problem!

My point is that this question, while incredibly difficult, may not seem so bad from the point of view of the problem-maker. It’s only difficult for the problem-solver. More importantly, I believe that the problem-solver has a much better chance of answering this type of question after having experience as a problem-maker. As much as possible, we want to be proactive rather than reactive.

So I propose that we practice making up good questions.
Good luck on the Putnam!

(A version of this post will be included in an up-coming book I’m working on, about mathematical problem-solving).

Squaring the Superhero

I teach many different courses in mathematics. All of these courses contain wonderful material. I love math. Last year, my son asked me why some of the students at his elementary school did not seem to share my view regarding the beauty of mathematics. So I started a math club at the elementary school, in an attempt to effect a positive change.

I decided to cover the same material covered in university courses, realizing that I would have to do it in a much different way. Here’s an example:

University version: If a rectangle has perimeter p, what is its maximum possible area?

Here’s what happened in the math club:

I asked them if they knew what a rectangle was, and how to find its perimeter and area. We discussed examples. It turns out they knew quite a bit. Then I asked each of them to produce their own rectangle, with perimeter 12 units. They each had a favorite! We calculated the area of each one. By the end of the meeting, we were convinced that the square provided the maximum area. By the end of the next meeting, we owned the difference of squares formula. In the third meeting, we used this formula to understand the formula for the area of a circle, partly by considering an appropriate annulus—this meeting was only a couple of days after a solar eclipse.

So what does this have to do with superheroes?

Special transformations, accompanied by superpowers, can really capture the imagination. It might be Clark Kent removing his glasses; but probably it’s a tadpole transforming into a frog or a caterpillar becoming a butterfly! In this video (link below), we introduce Rover the rectangle, Squeaky the square and Ellise the ellipse. Rover can change his shape as long as he stays rectangular and keeps the same perimeter—he catches circles by flattening himself! The intuitive connection between superpowers and transformations illustrates how a mathematician may think and feel about the subject. It is a human activity. It is fun, it is personal—it is mathematics!

So far we’ve made three of these videos. Each one represents portions of a different, hour-long meeting. Perhaps the best way to use them is to watch them with children, pause to discuss certain points, and then go off on a tangent. For example, in our first club meeting we eventually changed the fixed perimeter from 12 units to 14 units, which led to a wonderful discussion about fractions.

I tell the math club members how our discussions fit into the context of a university course. They LOVE this! At the end of our first meeting, we wrote down the formula for the maximum area “in terms of p” and they hurriedly copied it down as if it were a treasure. This was one of many pleasant surprises. Perhaps you are already doing something like this; if not, please try it. Become a superhero!

Note added: I was delighted to see that this video has been used by Bruce Ferrington, who keeps an excellent blog. Here is a recent post, “Same Perimeter, Different Area.”

On Trigonometric Nostalgia

When my son asks me how something works, I often don’t have a satisfactory answer. After all, I can’t rip open my digital device and show him the parts in action. In sharp contrast, we can see exactly how a bicycle works—we have an open window to its technical soul.

The problem is that fewer and fewer of our everyday tools have these open windows, and this really can’t be good for any of us. So when it comes to teaching, I prefer examples that come with an open window. I think this is a big part of what students want when they ask, “What is this good for?” After all, students master video games, without any direct application in mind; but they can see the entire picture.

I remember one day way back in the 1980’s walking through a field that probably no longer exists. I realized that I could calculate the height of the tree that was 30 feet from me, without climbing it. I did the calculation, even though no one asked me to do so. My enthusiasm was a result of understanding the whole point—I could gain information about a large triangle just by knowing things about right triangles with a hypotenuse of unit length. To me, this is trigonometry.

In this 3 minute video, I attempt to capture the excitement I had walking through the field. In the video, there is no field, but there is a clear story. For teachers, there will be a lot more to discuss after your students are done watching it. I hope you enjoy it!