Calculus Better Explained

Contents
1 1 Minute Calculus: X-Ray and Time-Lapse Vision
2 Practice X-Ray and Time-Lapse Vision
3 Expanding Our Intuition
4 Learning The Official Terms
5 Music From The Machine
6 Improving Arithmetic And Algebra
7 Seeing How Lines Work
8 Playing With Squares
9 Working With Infinity
10 The Theory Of Derivatives
11 The Fundamental Theorem Of Calculus (FTOC)
12 The Basic Arithmetic Of Calculus
13 Patterns In The Rules
14 The Fancy Arithmetic Of Calculus
15 Discovering Archimedes' Formulas
CHAPTER 1. 1 MINUTE CALCULUS: X-RAY AND TIME-LAPSE VISION
We usually take shapes, formulas, and situations at face value. Calculus gives
us two superpowers to dig deeper:
X-Ray Vision: You see the hidden pieces inside a pattern. You don't just
see the tree, you know it's made of rings, with another growing as we
speak.
Time-Lapse Vision: You see the future path of an object laid out before
you (cool, right?). "Hey, there's the moon. For the next few days it'll be
white, but on the sixth it'll be low in the sky, in a color I like. I'll take a
photo then."
So why is Calculus useful? Well, just imagine having X-Ray or Time-Lapse
vision to use at will. That object over there, how was it put together? What
will happen to it?
(Strangely, my letters to Marvel about Calculus-man have been ignored to
date.)
1.1 Calculus In 10 Minutes: See Patterns Step-By-Step
What do X-Ray and Time-Lapse vision have in common? They examine patterns step-by-step. An X-Ray shows the individual slices inside, and a time-lapse puts each future state next to the other.
This seems pretty abstract. Let's look at a few famous patterns:
We have a vague feeling these formulas are connected, right?
Let's turn on our X-Ray vision and see where this leads. Suppose we know the
equation for circumference ($2\pi r$) and want to figure out the equation for area.
What can we do?
This is a tough question. Squares are easy to measure, but what can we do
with an ever-curving shape?
Calculus to the rescue. Let's use our X-Ray vision to realize a disc is really
just a bunch of rings put together. Similar to a tree trunk, here's a step-by-step view of a filled-in circle:
Why does this viewpoint help? Well, let's unroll those curled-up rings so
they're easier to measure:
Whoa! We have a bunch of straightened rings that form a triangle, which
is much easier to measure (Wikipedia has an animation).
The height of the largest ring is the full circumference ($2\pi r$), and each ring
gets smaller. The height of each ring depends on its original distance from the
center; the ring 3 inches from the center has a height of $2\pi \cdot 3$ inches. The
smallest ring is a pinpoint, more or less, without any height at all.
And because triangles are easier to measure than circles, finding the area
isn't too much trouble. The area of the ring triangle $= \frac{1}{2} \cdot \text{base} \cdot \text{height} = \frac{1}{2} r (2\pi r) = \pi r^2$, which is the formula for a circle's area!
Our X-Ray vision revealed a simple, easy-to-measure structure within a
curvy shape. We realized a circle and a set of glued-together rings were really the same. From another perspective, a filled-in disc is the time lapse of a
ring that got larger and larger.
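The ring-by-ring idea can also be checked by brute force. Here is a minimal sketch, assuming we slice the disc into many thin rings of equal thickness and treat each unrolled ring as a rectangle; the function name and the ring count are made up for illustration.

```python
import math

def disc_area_from_rings(radius, num_rings=100_000):
    """Approximate a disc's area by adding up thin, unrolled rings."""
    dr = radius / num_rings              # thickness of each ring
    total = 0.0
    for i in range(num_rings):
        r = (i + 0.5) * dr               # distance of this ring from the center
        total += 2 * math.pi * r * dr    # ring height (circumference) times thickness
    return total

radius = 3.0
approx = disc_area_from_rings(radius)
exact = math.pi * radius ** 2            # the triangle formula from above
print(approx, exact)
```

The two numbers should agree to many decimal places: the glued-together rings and the filled-in disc really are the same thing.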
Remember learning arithmetic? We learned a few things to do with numbers (and equations): add/subtract, multiply/divide, and use exponents/roots.
Not a bad start.
Calculus gives us two new options: split apart and glue together. A key
epiphany of calculus is that our existing patterns can be seen as a bunch of
glued-together pieces. It's like staring at a building and knowing it was made
brick-by-brick.
1.2 So... What Can I Do With Calculus?
It depends. What can you do with arithmetic?

Technically, we don't need numbers. Our caveman ancestors did fine (well,
for some version of fine). But having an idea of quantity makes the world a
lot easier. You don't have a "big" and "small" pile of rocks: you have an exact
count. You don't need to describe your mood as "good" or "great": you can
explain your mood from 1 to 10.
Arithmetic provides universal metaphors that go beyond computation, and
once seen, they're difficult to give up.
Calculus is similar: it's a step-by-step view of the world. Do we need X-Ray
and Time-Lapse vision all the time? Nope. But they're nice perspectives to
turn on when we face a puzzling situation. What steps got us here? What's
the next thing that will happen? Where does our future path lead?
There are specific rules to calculus, just like there are rules to arithmetic.
And they're quite useful when you're cranking through an equation. But don't
forget there are general notions of "quantity" and "step-by-step thinking" that
we can bring everywhere.
Our primary goal is to feel what a calculus perspective is like (What Would
Archimedes Do?). Over time, we'll work up to using the specific rules ourselves.
CHAPTER 2. PRACTICE X-RAY AND TIME-LAPSE VISION
Calculus trains us to use X-Ray and Time-Lapse vision, such as re-arranging a
circle into a "ring triangle" (diagram). This makes finding the area... well, if
not exactly easy, much more manageable.
But we were a little presumptuous. Must every circle in the universe be
made from rings?
Heck no! We're more creative than that. Here's a few more options for our
X-Ray:
Now we're talking. We can imagine a circle as a set of rings, pizza slices, or
vertical boards. Each underlying blueprint is a different step-by-step strategy
in action.
Imagine each strategy unfolding over time, using your time-lapse vision.
Have any ideas about what each approach is good for?
Ring-by-ring Analysis
Rings are the old standby. What's neat about a ring-by-ring progression?
Each intermediate stage is an entire mini circle on its own, i.e., when
we're halfway done, we still have a circle, just one with half the regular
radius.
Each step is an increasing amount of work. Just imagine plowing a circular field and spreading the work over several days. On the first day,
you start at the center and don't even move. The next, you take
the tightest turn you can. Then you start doing laps, larger and larger,
until you are circling the entire yard on the last day. (Note: The change
between each ring is the same; maybe it's 1 extra minute each time you
make the ring larger.)
The work is reasonably predictable, which may help planning. If we
know it's an extra minute for each lap, then the 20th ring will take 20
minutes.
Most of the work happens in the final laps. In the first 25% of the time-lapse, we've barely grown: we're adding tiny rings. Near the end, we
start to pick up steam by adding long slices, each nearly the final size.
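To put rough numbers on "most of the work happens in the final laps", here is a small sketch. It assumes the time-lapse grows the radius at a steady pace and that lap n takes n minutes (one extra minute per lap, as in the plowing example); both are illustrative assumptions, not measurements.

```python
def fraction_of_area_done(fraction_of_radius):
    # Area scales with the square of the radius, so a disc grown to a
    # fraction f of its final radius covers f**2 of the final area.
    return fraction_of_radius ** 2

first_quarter = fraction_of_area_done(0.25)        # 0.0625, about 6% of the area
last_quarter = 1 - fraction_of_area_done(0.75)     # 0.4375, about 44% of the area

# Predictable work: lap n takes n minutes (assumed), so lap 20 takes 20 minutes.
lap_minutes = list(range(1, 21))
total_minutes = sum(lap_minutes)                   # 210 minutes for all 20 laps

print(first_quarter, last_quarter, lap_minutes[-1], total_minutes)
```

Under these assumptions the first quarter of the time-lapse adds roughly 6% of the disc, while the last quarter adds roughly 44%.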
Now let's get practical: why is it that trees have a ring pattern inside?
It's the first property: a big tree must grow from a complete smaller tree.
With the ring-by-ring strategy, we're always adding to a complete, fully-formed
circle. We aren't trying to grow the left half of the tree and then work on the
right side.
In fact, many natural processes that grow (trees, bones, bubbles, etc.) take
this inside-out approach.
Slice-by-slice Analysis
Now think about a slice-by-slice progression. What do you notice?

We contribute the same amount with each step. Even better, the parts
are identical. This may not matter for math, but in the real world (aka
cutting a cake), we make the same action when cutting out each slice,
which is convenient.
Since the slices are symmetrical, we can use shortcuts like making cuts
across the entire shape to speed up the process. These assembly line
speedups work well for identical components.
Progress is extremely easy to measure. If we have 10 slices, then at slice
6 we are exactly 60% done (by both area and circumference).
We follow a sweeping circular path, never retracing our steps from an
angular point of view. When carving out the rings, we went through
the full 360-degrees on each step.
Time to think about the real world. What follows this slice-by-slice pattern,
and why?
Well, food, for one. Cake, pizza, pie: we want everyone to have an equal
share. Slices are simple to cut, we get nice speedups (like cutting across the
cake), and it's easy to see how much is remaining. (Imagine cutting circular
rings from a pie and trying to estimate how much area is left.)
Now think about radar scanners: they sweep out in a circular path, clearing a slice of sky before moving to another angle. This strategy does leave
a blind spot in the angle you haven't yet covered, a tradeoff you're hopefully
aware of.
Contrast this to sonar used by a submarine or bat, which sends a sound
ring propagating in every direction. That works best for close targets (covering every direction at once). The drawback is that unfocused propagation gets
much weaker the further out you go, as the initial energy is spread out over a
larger ring. We use megaphones and antennas to focus our signals into beams
(thin slices) to get the max range for our energy.
Operationally, if we're building a circular shape from a set of slices (like a
paper fan), it helps to have every part be identical. Figure out the best way
to make a single slice, then mass produce them. Even better: if one slice can
collapse, the entire shape can fold up!
Board-by-board Analysis
Getting the hang of X-Rays and Time-lapses? Great. Look at the progression
above, and spend a few seconds thinking of the pros and cons. Don't worry, I'll
wait.
Ready? Ok. Here's a few of my observations:
This is a very robotic pattern, moving left-to-right and never returning to
a previous horizontal position.
The contribution from each step starts small, gradually gets larger, maxes
out in the middle, and begins shrinking again.
Our progress is somewhat unpredictable. Sure, at the halfway mark
we've finished half the circle, but the pattern rises and falls which makes
it difficult to analyze. By contrast, the ring-by-ring pattern changed the
same amount each time, always increasing. It was clear that the later
rings would add the most work. Here, it's the middle section which seems
to be doing the heavy lifting.
Ok, time to figure out where this pattern shows up in the real world.
Decks and wooden structures, for one. When putting down wooden planks,
we don't want to retrace our steps, or return to a previous position (especially
if there are other steps involved, like painting). Just like a tree needs a fully-formed circle at each step, a deck insists upon components found at Home
Depot (i.e., rectangular boards).
In fact, any process with a linear pipeline might use this approach: finish
a section and move onto the next. Think about a printer that has to spray a
pattern top-to-bottom as the paper is fed through (or these days, a 3d printer).
It doesn't have the luxury of a ring-by-ring or a slice-by-slice approach. It will
see a horizontal position only once, so it better make it count!
From a human motivation perspective, it may be convenient to start small,
work your way up, then ease back down. A pizza-slice approach could be
tolerable (identical progress every day), but rings could be demoralizing: every
step requires more than the one before, without yielding.
Getting Organized
So far, we've been using natural descriptions to explain our thought process.
"Take a bunch of rings" or "Cut the circle into pizza slices". This conveys a general notion, but it's a bit like describing a song as "Dum-de-dum-dum": you're
pretty much the only one who knows what's happening. A little organization
can make it perfectly clear what we mean.
The first thing we can do is keep track of how we're making our steps. I like
to imagine a little arrow in the direction we move as we take slices:
In my head, I'm moving along the yellow line, calling out the steps like
Oprah giving away cars (you get a ring, you get a ring, you get a ring...).
(Hey, it's my analogy, don't give me that look!)
The arrow is handy, but it's still tricky to see the exact progression of slices.
Why don't we explicitly line up the changes? As we saw before, we can unroll
the steps, put them side-by-side, and make them easier to compare:
The black arrow shows the trend. Pretty nice, right? We can tell, at a
glance, that the slices are increasing, and by the same amount each time (since
the trend line is straight).
Math fans and neurotics alike enjoy these organized layouts; there is something soothing about it, I suppose. And since you're here, we might as well
organize the other patterns too:
Now it's much easier to compare each X-Ray strategy:

With rings, steps increase steadily (upward sloping line)
With slices, steps stay the same (flat line)
With boards, steps get larger, peak, then get smaller (up and down; note,
the curve looks elongated because the individual boards are lined up on
the bottom)
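If you'd like to see those three trends as numbers rather than charts, here is a rough sketch. The helper names are hypothetical, and the step sizes are assumptions about what each piece measures: a ring is circumference times thickness, a slice is treated as a thin triangle (half of base times height), and a board is a rectangle whose half-height comes from the Pythagorean Theorem.

```python
import math

def ring_steps(radius, n):
    """Area of each ring, moving outward from the center."""
    dr = radius / n
    return [2 * math.pi * ((i + 0.5) * dr) * dr for i in range(n)]

def slice_steps(radius, n):
    """Area of each pizza slice, treated as a thin triangle."""
    dp = 2 * math.pi * radius / n            # chunk of perimeter per slice
    return [0.5 * dp * radius for _ in range(n)]

def board_steps(radius, n):
    """Area of each vertical board, moving left to right."""
    dx = 2 * radius / n
    steps = []
    for i in range(n):
        x = -radius + (i + 0.5) * dx         # horizontal position of this board
        steps.append(2 * math.sqrt(radius ** 2 - x ** 2) * dx)
    return steps

rings = ring_steps(1.0, 8)
slices = slice_steps(1.0, 8)
boards = board_steps(1.0, 8)
for name, steps in [("rings", rings), ("slices", slices), ("boards", boards)]:
    print(name, [round(s, 3) for s in steps])
```

Printed side by side, the ring list climbs steadily, the slice list stays flat, and the board list rises to a peak in the middle and falls again.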
The charts made our comparisons easier, wouldn't you say? Sure. But wait,
isn't that trendline looking like a dreaded x-y graph?
Yep. Remember, a graph is a visual explanation that should help us. If it's
confusing, it needs to be fixed.
Many classes present graphs, divorced from the phenomena that made
them, and hope you see an invisible sequence of steps buried inside. It's a
recipe for pain: just be explicit about what a graph represents!
Archimedes did fine without x-y graphs, finding the area of a circle using
the "ring-to-triangle" method. In this primer we'll leave our level of graphing
to what you see above (the details of graphs will be a nice follow-up, after our
intuition is built).
So, are things starting to click a bit? Thinking better with X-Rays and Time-lapses?
PS. It may bother you that our steps create a "circle-like" shape, but not a
real, smooth circle. We'll get to that :). But to be fair, it must also bother you
that the square pixels on this screen make "letter-like" shapes, and not real,
smooth letters. And somehow, the "letter-like pixels" convey the same meaning
as the real thing!
Questions
1) Can you describe a grandma-friendly version of what you've learned?
2) Let's expand our thinking into the 3rd dimension. Can you think of a few ways
to build a sphere? (No formulas, just descriptions)
I'll share a few approaches with you in the next lesson. Happy math.
CHAPTER 3. EXPANDING OUR INTUITION
Hope you thought about the question from last time: how do we take our X-Ray
strategies into the 3rd dimension?
Here's my take:
Rings become shells, a thick candy coating on a delicious gobstopper.
Each layer is slightly bigger than the one before.
Slices become wedges, identical sections like slices of an orange.
Boards become plates, thick discs which can be stacked together. (I sometimes daydream of opening a bed & breakfast that only serves spherical
stacks of pancakes. And they say math has no real-world uses!)
The 3d steps can be seen as the 2d versions, swept out in various ways. For
example, spin the individual rings (like a coin) to create a shell. Imagine slices
pushed through a changing mold to make wedges. Lastly, imagine we spin the
boards to make plates, like carving a wooden sphere with a lathe (video).
The tradeoffs are similar to the 2d versions:
Organic processes grow in layers (pearls in an oyster)

Fair divisions require wedges (cutting an apple for friends)
The robotic plate approach seems easy to manufacture
An orange is an interesting hybrid: from the outside, it appears to be made
from shells, growing over time. And inside, it forms a symmetric internal structure: a better way to distribute seeds, right? We could analyze it both ways.
Exploring The 3d Perspective

In the first lesson we had the vague notion the circle/sphere formulas were
related:
Well, now we have an idea how:
Circumference: Start with a single ring

Area: Make a filled-in disc with a ring-by-ring time lapse
Volume: Make the circle into a plate, and do a plate-by-plate time lapse
to build a sphere
Surface area: X-Ray the sphere into a bunch of shells; the outer shell is
the surface area
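The plate-by-plate time lapse can be tried numerically. This is only a sketch under an assumption the text hasn't spelled out yet: a plate at horizontal position x is a thin disc whose radius comes from the Pythagorean Theorem, so its face has area pi * (r^2 - x^2). The function name and plate count are illustrative.

```python
import math

def sphere_volume_from_plates(radius, num_plates=100_000):
    """Approximate a sphere's volume by stacking thin circular plates."""
    dx = 2 * radius / num_plates                       # thickness of each plate
    total = 0.0
    for i in range(num_plates):
        x = -radius + (i + 0.5) * dx                   # position of this plate
        plate_area = math.pi * (radius ** 2 - x ** 2)  # disc at position x
        total += plate_area * dx
    return total

radius = 1.0
approx = sphere_volume_from_plates(radius)
exact = 4 / 3 * math.pi * radius ** 3                  # the familiar volume formula
print(approx, exact)
```

The stacked plates land very close to the familiar sphere volume, which is the kind of connection between the formulas we were hoping for.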
Wow! These descriptions are pretty detailed. We know, intuitively, how to
morph shapes into alternate versions by thinking "time-lapse this" or "X-Ray
that". We can move backwards, from a sphere back to circumference, or try
different strategies: maybe we want to split the circle into boards, not rings.
The Need For Math Notation

You might have noticed it's getting harder to explain your ideas. We're reaching
for physical analogies (rings, boards, wedges) to explain our plans: "Ok, take
that circular area, and try to make some discs out of it. Yeah, like that. Now
line them up into the shape of a sphere...".
I love diagrams and analogies, but should they be required to explain an
idea? Probably not.
Take a look how numbers developed. At first, we used very literal symbols
for counting: I, II, III, and so on. Eventually, we realized a symbol like V could
take the place of IIIII, and even better, that every digit could have its own
symbol (we do keep our metaphorical history with the number 1).
This math abstraction helped in a few ways:
It's shorter. Isn't 2 + 3 = 5 better than "two added to three is equal
to five"? Fun fact: In 1557, Robert Recorde invented the equals sign,
written with two parallel lines (=), because "noe 2 thynges, can be moare
equalle". (I agrye!)
The rules do the work for us. With Roman numerals, we're essentially
recreating numbers by hand (why should VIII take so much effort to write
compared to I? Oh, because 8 is larger than 1? Not a good reason!). Decimals help us "do the work" of expressing numbers, and make them easy
to manipulate. So far, we've been doing the work of calculus ourselves:
cutting a circle into rings, realizing we can unroll them, looking up the
equation for area and measuring the resulting triangle. Couldn't the rules
help us here? You bet. We just need to figure them out.
We generalized our thinking. "2 + 3 = 5" is really "twoness + threeness = fiveness". It sounds weird, but we have an abstract quantity (not
people, or money, or cows... just "twoness") and we see how it's related
to other quantities. The rules of arithmetic are abstract, and it's our job
to apply them to a specific scenario.
Multiplication started as a way to count groups and measure rectangular
area. But when you write "$15/hour × 2.5 hours = $37.50" you probably
aren't thinking of "groups" of hours or getting the area of a "wage-hour" rectangle. You're just applying arithmetic to the concepts.
In the upcoming lessons we'll learn the official language to help us communicate our ideas and work out the rules ourselves. And once we've internalized
the rules of calculus, we can explore patterns, whether they came from geometric shapes, business plans, or scientific theories.
CHAPTER 4. LEARNING THE OFFICIAL TERMS

We've been able to describe our step-by-step process with analogies (X-Rays,
Time-lapses, and rings) and diagrams:
However, this is a very elaborate way to communicate. Here's the Official
Math terms:
Let's walk through the fancy names.
The Derivative
The derivative is splitting a shape into sections as we move along a path (i.e.,
X-Raying it). Now here's the trick: although the derivative generates the entire
sequence of sections (the black line), we can also extract a single one.
Think about a function like $f(x) = x^2$. It's a curve that describes a giant list
of possibilities (1, 4, 9, 16, 25, etc.). We can graph the entire curve, sure, or
examine the value of $f(x)$ at a specific value, like $x = 3$.
The derivative is similar. Officially, it's the entire pattern of sections, but we
can zero in on a specific one by asking for the derivative at a certain value. (The
derivative is a function, just like $f(x) = x^2$; if not otherwise specified, we're
describing the entire function.)
What do we need to find the derivative? The shape to split apart, and the
path to follow as we cut it up (the orange arrow). For example:
The derivative of a circle with respect to the radius creates rings
The derivative of a circle with respect to the perimeter creates slices
The derivative of a circle with respect to the x-axis creates boards
I agree that "with respect to" sounds formal: "Honorable Grand Poombah
radius, it is with respect to you that we derive." Math is a gentleman's game, I
suppose.
Taking the derivative is also called differentiating, because we are finding
the difference between successive positions as a shape grows. (As we grow the
radius of a circle, the difference between the current disc and the next size up
is that outer ring.)
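That "difference between successive positions" idea is easy to poke at with a few lines of code. A minimal sketch, assuming we already trust the area formula pi * r^2 and pick a small, arbitrary step dr (the function names are illustrative):

```python
import math

def disc_area(r):
    return math.pi * r ** 2

def outer_ring(r, dr):
    """The difference between the current disc and the next size up."""
    return disc_area(r + dr) - disc_area(r)

r, dr = 3.0, 1e-6
ring_height = outer_ring(r, dr) / dr      # ring area per unit of thickness
print(ring_height, 2 * math.pi * r)       # both are close to the circumference
```

Dividing the thin ring's area by its thickness leaves its height, which lands right next to the circumference at that radius.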
The Integral, Arrows, and Slices

The integral is gluing together (time-lapsing) a group of sections and measuring the final result. For example, we glued together the rings (into a "ring
triangle") and saw it accumulated to $\pi r^2$, aka the area of a circle.
Here's what we need to find the integral:
Which direction are we gluing the steps together? Along the orange
line (the radius, in this case)
When do we start and stop? At the start and end of the arrow (we start
at 0, no radius, and move to r, the full radius)
How big is each step? Well... each item is a "ring". Isn't that enough?
Nope! We need to be specific. We've been saying we cut a circle into "rings"
or "pizza slices" or "boards". But that's not specific enough; it's like a BBQ
recipe that says "Cook meat. Flavor to taste."
Maybe an expert knows what to do, but we need more specifics. How large,
exactly, is each step (technically called the "integrand")?
Ah. A few notes about the variables:

If we are moving along the radius $r$, then $dr$ is the little chunk of radius
in the current step
The height of the ring is the circumference, or $2\pi r$
There are several gotchas to keep in mind.
First, $dr$ is its own variable, and not "d times r". It represents the tiny
section of the radius present in the current step. This symbol ($dr$, $dx$, etc.)
is often separated from the integrand by just a space, and it's assumed to be
multiplied (written $2\pi r \, dr$).
Next, if $r$ is the only variable used in the integral, then $dr$ is assumed to
be there. So if you see $\int 2\pi r$ this still implies we're doing the full $\int 2\pi r \, dr$.
(Again, if there are two variables involved, like radius and perimeter, you need
to clarify which step we're using: $dr$ or $dp$?)
Last, remember that $r$ (the radius) changes as we time-lapse, starting at 0
and eventually reaching its final value. When we see $r$ in the context of a step,
it means "the size of the radius at the current step" and not the final value it
may ultimately have.
These issues are extremely confusing. I'd prefer we use $r_{dr}$ to indicate an
intermediate "r at the current step" instead of a general-purpose "r" that's easily
confused with the max value of the radius. I can't change the symbols at this
point, unfortunately.
Practicing The Lingo

Let's learn to talk like calculus natives. Here's how we can describe our X-Ray
strategies:
Remember, the derivative just splits the shape into (hopefully) easy-to-measure steps, such as rings of size $2\pi r \, dr$. We broke apart our lego set and
have pieces scattered on the floor. We still need an integral to glue the parts
together and measure the new size. The two commands are a tag team:
The derivative says: "Ok, I split the shape apart for you. It looks like a
bunch of pieces $2\pi r$ tall and $dr$ wide."
The integral says: "Oh, those pieces resemble a triangle: I can measure
that! The total area of that triangle is $\frac{1}{2} \cdot \text{base} \cdot \text{height}$, which works out to
$\pi r^2$ in this case."
Here's how we'd write the integrals to measure the steps we've made:
A few notes:
Often, we write an integrand as an unspecified "pizza slice" or "board"
(use a formal-sounding name like s(p) or b(x) if you like). First, we set up
the integral, and then we worry about the exact formula for a board or
slice.
Because each integral represents slices from our original circle, we know
they will be the same. Gluing any set of slices should always return the
total area, right?
The integral is often described as "the area under the curve". It's accurate,
but shortsighted. Yes, we are gluing together the rectangular slices under
the curve. But this completely overlooks the preceding X-Ray and Time-Lapse thinking. Why are we dealing with a set of slices vs. a curve in the
first place? Most likely, because those slices are easier than analyzing the
shape itself (how do you directly measure a circle?).
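As a sketch of what the three setups look like once the step sizes are filled in: the wedge and board sizes here are the ones worked out in the Music From The Machine chapter, the limits are my reading of where each arrow starts and stops, and p and x follow the s(p) and b(x) naming above. Note that in the rings line, r does double duty as both the current step and the final radius, which is exactly the gotcha mentioned earlier.

```latex
\begin{align*}
\text{Rings:}  \quad & \int_{0}^{r} 2\pi r \, dr \\
\text{Slices:} \quad & \int_{0}^{2\pi r} \tfrac{1}{2}\, r \, dp \\
\text{Boards:} \quad & \int_{-r}^{r} 2\sqrt{r^2 - x^2} \, dx
\end{align*}
```

Since every line glues together pieces of the same circle, each one should come out to $\pi r^2$.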
Questions
1) Can you think of another activity which is made simpler by shortcuts and
notation, vs. written English?
2) Interested in performance? Let's drive the calculus car, even if you can't
build it yet.
Question 1: How would you write the integrals that cover half of a circle?
Each should be similar to:

integrate [size of step] from [start] to [end] with respect to [path
variable]
(Answer for the first half and the second half. This links to Wolfram Alpha,
an online calculator, and we'll learn to use it later on.)
Question 2: Can you find the complete way to describe our pizza-slice
approach?
The math command should be something like this:

integrate [size of step] from [start] to [end] with respect to [path
variable]
Remember that each slice is basically a triangle (so what's the area?). The
slices move around the perimeter (where does it start and stop?). Have a guess
for the command? Here it is, the slice-by-slice description.
Question 3: Can you figure out how to move from volume to surface area?
Assume we know the volume of a sphere is 4/3 * pi * r^3. Think about
the instructions to separate that volume into a sequence of shells. Which variable are we moving through?
derive [equation] with respect to [path variable]
Have a guess? Great. Here's the command to turn volume into surface area.
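If you want to check your guess without the online calculator, here is a rough numeric sketch. It assumes the path variable is the radius and estimates the shell by comparing a slightly larger and a slightly smaller sphere; the step dr is an arbitrary small number and the function names are made up.

```python
import math

def sphere_volume(r):
    return 4 / 3 * math.pi * r ** 3

def shell_per_unit_radius(r, dr=1e-6):
    """Volume gained per unit of radius: a numeric stand-in for 'derive'."""
    return (sphere_volume(r + dr) - sphere_volume(r - dr)) / (2 * dr)

r = 2.0
shell = shell_per_unit_radius(r)
print(shell, 4 * math.pi * r ** 2)   # compare with the surface area formula
```

The outer shell, measured per unit of thickness, matches the surface area formula to several decimal places.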
CHAPTER 5. MUSIC FROM THE MACHINE

In the previous lessons we've gradually sharpened our intuition:
Appreciation: I think it's possible to split up a circle to measure its area
Natural Description: Split the circle into rings from the center outwards,
like so:
Formal Description: integrate 2 * pi * r * dr from r=0 to r=r
Performance: (Sigh) I guess I'll have to start measuring the area...
Wait! Our formal description is precise enough that a computer can do the
work for us:
Whoa! We described our thoughts well enough that a computer did the
legwork.
We didn't need to manually unroll the rings, draw the triangle, and find
the area (which isn't overly tough in this case, but could have been). We saw
what the steps would be, wrote them down, and fed them to a computer:
boomshakala, we have the result. (Just worry about the "definite integral"
portion for now.)
Now, how about derivatives, X-Raying a pattern into steps? Well, we can
ask for that too:
Similar to above, the computer X-Rayed the formula for area and split it
step-by-step as it moved. The result is $2\pi r$, the height of the ring at every
position.
Seeing The Language In Action

Wolfram Alpha is an easy-to-use tool: the general format for calculus questions
is
integrate [equation] from [variable=start] to [variable=end]
derive [equation] with respect to [variable]
That's a little wordy. These shortcuts are closer to the math symbols:
\int [equation] dr - integrate equation (by default, assume we go
from r = 0 to r = r , the max value)
d/dr equation - derive equation with respect to r
There are shortcuts for exponents (3^2 = 9), multiplication (3 * r), and
roots (sqrt(9) = 3)
Now that we have the machine handy, let's try a few of the results we've
seen so far:
Click the formal description to see the computer crunch the numbers. As
you might have expected, they all result in the familiar equation for area. A
few notes:
The size of the wedge is $\frac{1}{2} \cdot \text{base} \cdot \text{height}$. The base is $dp$ (the tiny section
of perimeter) and the height is $r$, the distance from the perimeter back
to the center.
The size of a board is tricky. In terms of x & y coordinates, we have $x^2 + y^2 = r^2$, by the Pythagorean Theorem:
We solve for the height to get $y = \sqrt{r^2 - x^2}$. We actually need 2 copies of
height, because $y$ is the positive distance above the axis, and the board
extends above and below. The boards are harder to work with, and it's
not just you: Wolfram Alpha takes longer to compute this integral than
the others!
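For readers without the links handy, here is a homemade stand-in for the "integrate [size of step] from [start] to [end]" command. It is a plain numeric sketch (adding up many thin steps), not how Wolfram Alpha works internally; the `integrate` helper, the step count, and the name R for the final radius are all my own choices.

```python
import math

def integrate(step_height, start, end, num_steps=200_000):
    """Glue together thin steps of size step_height(position) * width."""
    width = (end - start) / num_steps
    total = 0.0
    for i in range(num_steps):
        position = start + (i + 0.5) * width   # middle of the current step
        total += step_height(position) * width
    return total

R = 1.0  # final radius of the circle
rings = integrate(lambda r: 2 * math.pi * r, 0, R)                     # along the radius
wedges = integrate(lambda p: 0.5 * R, 0, 2 * math.pi * R)              # along the perimeter
boards = integrate(lambda x: 2 * math.sqrt(R**2 - x**2), -R, R)        # along the x-axis

print(rings, wedges, boards, math.pi * R ** 2)
```

All three strategies land on (nearly) the same number, the familiar area of the circle. The board version is the least exact for a given number of steps, which fits the feeling that boards are the awkward ones.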
The approach so far has been to immerse you in calculus thinking, and
gradually introduce the notation. Some of it may be a whirl, which is completely expected. You're sitting at a cafe, overhearing conversation in a foreign
language.
Now that you have the sound in your head, we'll begin to explore the details
piece-by-piece.
CHAPTER 6. IMPROVING ARITHMETIC AND ALGEBRA

We've intuitively seen how calculus dissects problems with a step-by-step viewpoint. Now that we have the official symbols, let's see how to bring arithmetic
and algebra to the next level.
Better Multiplication And Division

Multiplication makes addition easier. Instead of grinding through questions
like 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2, we can rewrite it as: 2 × 13.
Boomshakala. If you wanted 13 copies of a number, just write it like that!
Multiplication makes repeated addition easier (likewise for division and
subtraction). But there's a big limitation: we must use identical, average-sized
pieces.
What's 2 × 13? It's 13 copies of the same element.
What's 100 / 5? It's 100 split into 5 equal parts.
Identical parts are fine for textbook scenarios ("Drive an unwavering 30mph
for exactly 3 hours"). The real world isn't so smooth. Calculus lets us accumulate or separate shapes according to their actual, not average, amount:
Derivatives are better division that splits a shape along a path (into
possibly different-sized slices)
Integrals are better multiplication that accumulates a sequence of steps
(which could be different sizes)
Operation | Example | Notes
Division | $\frac{y}{x}$ | Split whole into identical parts
Differentiation | $\frac{d}{dx} y$ | Split whole into (possibly different) parts
Multiplication | $y \cdot x$ | Accumulate identical steps
Integration | $\int y \, dx$ | Accumulate (possibly different) steps
Let's analyze our circle-to-ring example again. How does arithmetic/algebra
compare to calculus?
Division spits back the average-sized ring in our pattern. The derivative
gives a formula ($2\pi r$) that describes every ring (just plug in r). Similarly, multiplication lets us scale up the average element (once we've found it) into the
full amount. Integrals let us add up the pattern directly.
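A quick sketch of that contrast, using a circle of radius 10 as an arbitrary example (the variable and function names are made up for illustration):

```python
import math

radius = 10.0
area = math.pi * radius ** 2

# Division: one averaged ring height for the whole pattern.
average_ring = area / radius

# Derivative: a formula that describes every individual ring.
def ring_height(r):
    return 2 * math.pi * r

# Multiplication scales the average element back up to the full amount.
rebuilt_area = average_ring * radius

print(average_ring, ring_height(radius / 2), ring_height(radius))
```

The averaged ring happens to match the actual ring halfway out, but it says nothing about the tiny rings near the center or the big one at the edge. The formula does.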
Sometimes we want to use the average item, not the fancy calculus steps,
because it's a simpler representation of the whole ("What's the average transaction size? I don't need the full list"). That's fine, as long as it's a conscious
choice.
Better Formulas
If calculus provides better, more-specific versions of multiplication and division,
shouldn't we rewrite formulas with it? You bet.
An equation like $\text{distance} = \text{speed} \cdot \text{time}$ explains how to find total distance
assuming an average speed. An equation like $\text{distance} = \int \text{speed} \, dt$ tells us
how to find total distance by breaking time into instants (split along the t
axis), and accumulating the (potentially unique) distance traveled each instant
($\text{speed} \cdot dt$).
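Algebra | Calculus
$\text{distance} = \text{speed} \cdot \text{time}$ | $\text{distance} = \int \text{speed} \, dt$
$\text{speed} = \frac{\text{distance}}{\text{time}}$ | $\text{speed} = \frac{d}{dt} \text{distance}$
$\text{area} = \text{height} \cdot \text{width}$ | $\text{area} = \int \text{height} \, dw$
$\text{weight} = \text{density} \cdot \text{length} \cdot \text{width} \cdot \text{height}$ | $\text{weight} = \iiint \text{density} \, dx \, dy \, dz$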
Similarly, $\text{speed} = \frac{d}{dt} \text{distance}$ explains that we can split our trajectory into
time segments, and the (potentially unique) amount we moved in that time
slice was the speed.
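Here is a toy version of both directions. The speed readings are invented for illustration (one reading per half hour of a three-hour drive), and real instants would be far thinner than half an hour.

```python
speeds_mph = [10, 20, 30, 40, 50, 30]   # hypothetical speed in each time slice
dt_hours = 0.5                          # length of each time slice

# Integral-style: accumulate the (different) distance covered in each slice.
distance = sum(speed * dt_hours for speed in speeds_mph)      # 90.0 miles

# Algebra-style: one average speed for the whole trip.
average_speed = distance / (len(speeds_mph) * dt_hours)       # 30.0 mph

# Derivative-style: split the running distance back into per-slice speeds.
cumulative = [0.0]
for speed in speeds_mph:
    cumulative.append(cumulative[-1] + speed * dt_hours)
recovered = [(after - before) / dt_hours
             for before, after in zip(cumulative, cumulative[1:])]

print(distance, average_speed, recovered)
```

The average comes out to the textbook 30mph, but only the per-slice view remembers that we crawled at the start and sped up in the middle.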
The overused "integrals are area under the curve" explanation becomes
more clear. Multiplication, because it deals with static quantities, can only
measure the area of rectangles. Integrals let measurements curve and undulate
as we go: we'll add their contribution, regardless.
A series of multiplications becomes a series of integrals (called a triple integral). It's beyond this primer, but your suspicion was correct: we can mimic
the multiplications and integrate several times in a row.
Math, and specifically calculus, is the language of science because it describes relationships extremely well. When I see a formula with an integral or
derivative, I mentally convert it to multiplication or division (with the understanding that this will give the average element, not the actual one).
Better Algebra
Algebra lets us start with one fact and systematically work out others. Imagine
I want to know the area of an unknown square. I can't measure the area, but I
overheard someone saying it was 13.3 inches on a side.
Algebra | Thinking process
\(\text{Area of square} = ?\) | The area of this square is unknown...
\(\sqrt{\text{Area}} = 13.3\) | ...but I know the square root.
\((\sqrt{\text{Area}})^2 = (13.3)^2\) | Square both sides...
\(\text{Area} = 176.89\) | ...and I can recreate the original area
Remember learning that along with add/subtract/multiply/divide, we could
take powers and roots? We added two new ways to transform an equation.
Well, calculus extends algebra with two more operations: integrals and
derivatives. Now we can work out the area of a circle, algebra-style:
Algebra + Calculus | Thinking process
\(\text{Area of circle} = ?\) | The area of a circle is unknown...
\(\frac{d}{dr} \text{Area} = 2\pi r\) | ...but I know it splits into rings (along the radius)
\(\int \frac{d}{dr} \text{Area} = \int 2\pi r\) | Integrate both sides...
\(\text{Area} = \pi r^2\) | ...and I can recreate the original area
The abbreviated notation helps see the big picture. If the integrand only
uses a single variable (as in \(2\pi r\)), we can assume we're using \(dr\) from \(r = 0\)
to \(r = r\). This helps think of integrals and derivatives like squares and square
roots: operations that cancel!
It's pretty neat: "gluing together" and "splitting apart" should behave like
opposites, right?
With our simpler notation, we can write \(\int \frac{d}{dr} \text{Area} = \text{Area}\) instead of the
bulky \(\int_0^r \frac{d}{dr} \text{Area} \; dr = \text{Area}\).
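To see the "integrate the rings" step with actual numbers, here's a small sketch (the function name `area_from_rings` and the radius of 3 are my own choices for illustration). It adds up thin rings of circumference \(2\pi r\) and width dr, then compares against \(\pi r^2\).

```python
import math

def area_from_rings(radius, rings=100_000):
    """Glue together thin rings: each contributes circumference * width."""
    dr = radius / rings
    total = 0.0
    for i in range(rings):
        r = (i + 0.5) * dr             # radius at the middle of this ring
        total += 2 * math.pi * r * dr  # ring "area" = 2*pi*r * dr
    return total

R = 3.0
print(area_from_rings(R))  # accumulated ring by ring
print(math.pi * R ** 2)    # the formula we recreated
```

The two printed values should agree to many decimal places, which is the "operations that cancel" idea in action.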
Learning The Rules

With arithmetic, we learned special techniques for combining whole numbers,
decimals, fractions, and roots/powers. Even though \(3 + 9 = 12\), we can't assume
\(\sqrt{3} + \sqrt{9} = \sqrt{12}\).
Similarly, we need to learn the rules for how integrals/derivatives work
when added, multiplied, and so on. Yes, there are fancy rules for special categories (what to do with \(e^x\), natural log, sine, cosine, etc.), but I'm not concerned with that. Let's get extremely comfortable with the basics. The fancy
stuff can wait.
CHAPTER 7
SEEING HOW LINES WORK

Let's start by analyzing a fairly simple pattern, a line:
\[ f(x) = 4x \]
In everyday terms, we enter an input, \(x\), and get an output, \(f(x)\). And in
this case, the output is 4 times the input. Suppose we're buying fencing. For
every foot we ask for (the input, \(x\)), it costs us $4 (the output, \(f(x)\)). 3 feet of
fence would cost $12. Fair enough.
Notice the abstract formula \(f(x) = 4x\) only considers quantities, not units.
We could have written that a foot of fence costs 400 pennies, \(f(x) = 400x\), and
it's up to us to remember it's dollars in the first formula and pennies in the
second. Our equations use abstract, unitless quantities, and we have to bring
the results back to the real world.
Finding the Derivative Of A Line

The derivative of a pattern, \(\frac{d}{dx} f(x)\), is the sequence of steps generated as we make
slices along an input variable (\(x\) is the natural choice here). How do we figure
out the sequence of steps?
Well, I imagine going to Home Depot and pestering the clerk:
You: I'd like some lumber please. What will it run me?
Clerk: How much do you want?
You: Um... 1 foot, I think.
Clerk: That'll be $4. Anything else I can help you with?
You: Actually, it might be 2 feet.
Clerk: That'll be $8. Anything else I can help you with?
You: It might be 3 feet.
Clerk: (sigh) That'll be $12. Anything else I can help you with?
You: How about 4 feet?
We have a relationship (\(f(x) = 4x\)) and investigate it by changing the input
a tiny bit. We see if there's a change in output (there is!), then we change the
input again, and so on.
In this case, it's clear that an additional foot of fencing raises the cost by
$4. So we've just determined the derivative to be a constant 4, right?
Not so fast. Sure, we thought about the process and worked it out, but let's
be a little more organized (not every pattern is so simple). Can we describe
our steps?
1: Get the current output, \(f(x)\). In our case, \(f(1) = 4\).
2: Step forward by \(dx\) (1 foot, for example)
3: Find the new amount, \(f(x + dx)\). In our case, it's \(f(1 + 1) = f(2) = 8\).
4: Compute the difference: \(f(x + dx) - f(x)\), or 8 - 4 = 4
Ah! The difference between the next step and the current one is the size of
our slice. For \(f(x) = 4x\) we have:
\[ f(x + dx) - f(x) = 4(x + dx) - 4(x) = 4 \, dx \]
Increasing length by \(dx\) increases the cost by \(4 \, dx\).
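The four steps above can be sketched as a tiny Python function. The name `change_in_output` is just something I made up for this sketch, and the fence price is the $4 per foot from our example.

```python
def f(x):
    """Cost of x feet of fencing at $4 per foot."""
    return 4 * x

def change_in_output(fn, x, dx):
    current = fn(x)              # step 1: get the current output
    new_amount = fn(x + dx)      # steps 2 and 3: step forward, find the new amount
    return new_amount - current  # step 4: compute the difference

# Pester the clerk at a few different starting lengths, one extra foot each time.
for x in [1, 2, 3, 10]:
    print(x, change_in_output(f, x, dx=1))  # the change is 4 every time

# A half-foot step changes the cost by half as much: 4 * dx.
print(change_in_output(f, 1, dx=0.5))  # 2.0
```

No matter where we start, the change only depends on the size of the step, which is what \(4 \, dx\) is telling us.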

That statement is true, but a little awkward: it talks about the total change.
Wouldn't it be better to have a ratio, such as "cost per foot"?
We can extract the ratio with a few shortcuts:
\(dx\) = change in our input
\(df\) = resulting change in our output, \(f(x + dx) - f(x)\)
\(\frac{df}{dx}\) = ratio of output change to input change
In our case, we have
\[ \frac{df}{dx} = \frac{4 \, dx}{dx} = 4 \]
Notice how we express the derivative as \(\frac{df}{dx}\) instead of \(\frac{d}{dx} f(x)\). What gives?
It turns out there are a few different versions we can use.
Think about the various ways we express multiplication:
Times symbol: \(3 \times 4\) (used in elementary school)
Dot: \(3 \cdot 4\) (used in middle school)
Implied multiplication with parentheses: \((x + 4)(x + 3)\)
Implied multiplication with a space: \(2\pi r \; dr\)
The more subtle the symbol, the more we focus on the relationship between the quantities; the more visible the symbol, the more we focus on the
computation.
The notation for derivatives is similar:
Some versions, like \(f'(x)\), remind us the sequence of steps is a variation
of the original pattern. Notation like \(\frac{df}{dx}\) puts us into detail-oriented mode,
thinking about the ratio of output change relative to the input change ("What's
the cost per additional foot?").
Remember, the derivative is a complete description of all the steps, but it
can be evaluated at a certain point to find the step there: What is the additional
cost/foot when \(x = 3\)? In our case, the answer is 4.
Let's check our findings with the computer:
Nice! Linear relationships are simple enough that we don't need to visualize
the steps (but if you do, imagine a 1-d line being split into segments: - - -), and
we can work with the symbols directly.
Finding The Integral Of A Constant

Now let's work in the other direction: given the sequence of steps, can we find
the size of the original pattern?
In our fence-building scenario, it's fairly straightforward. Solving
\[ \frac{df}{dx} = 4 \]
means answering "What pattern has an output change of 4 times the input
change?"
Well, we've just seen that \(f(x) = 4x\) results in \(f'(x) = 4\). So, if we're given
\(f'(x) = 4\), we can guess the original function must have been \(f(x) = 4x\).
I'm pretty sure we're right (what else could the integral of 4 be?) but let's
check with the computer:
Whoa, there are two different answers (definite and indefinite). Why? Well,
there are many functions that could increase cost by $4/foot! Here are a few:
Cost = $4 per foot, or \(f(x) = 4x\)
Cost = $4 + $4 per foot, or \(f(x) = 4 + 4x\)
Cost = $10 + $4 per foot, or \(f(x) = 10 + 4x\)
There could be a fixed per-order fee, with the fence cost added in. All the
equation \(f'(x) = 4\) says is that each additional foot of fencing is $4, but we don't
know the starting conditions.
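Here's a quick sketch of why the starting fee is invisible to the derivative. The three pricing schemes are the ones listed above; the dictionary and the helper name `cost_per_extra_foot` are just scaffolding I added for illustration.

```python
# Three pricing schemes with different up-front fees.
candidates = {
    "4x": lambda x: 4 * x,
    "4 + 4x": lambda x: 4 + 4 * x,
    "10 + 4x": lambda x: 10 + 4 * x,
}

def cost_per_extra_foot(fn, x, dx=1):
    """Ratio of output change to input change, df/dx."""
    return (fn(x + dx) - fn(x)) / dx

for name, fn in candidates.items():
    rates = [cost_per_extra_foot(fn, x) for x in range(5)]
    print(name, "rates:", rates, "starting cost:", fn(0))
```

Every scheme reports the same rate of 4 at every point; only the starting cost differs, and that is the piece the steps alone can't tell us.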
The definite integral tracks the accumulation of a set amount of slices.
The range can be numbers, such as \(\int_0^{13} 4\), which measures the slices from
x=0 to x=13 (\(13 \times 4 = 52\)). If the range includes a variable (0 to x), then
the accumulation will be an equation (\(4x\)).
The indefinite integral finds the actual formula that created the pattern
of steps, not just the accumulation in that range. It's written with just an
integral sign: \(\int f(x)\). And as we've seen, the possibilities for the original
function should allow for a starting offset of C.
The notation for integrals can be fast-and-loose, and it's confusing. Are we
looking for an accumulation, or the original function? Are we leaving out \(dx\)?
These details are often omitted, so it's important to feel what's happening.
The Secret: We Can Work Backwards

The little secret of integrals is that we don't need to solve them directly. Instead
of trying to glue slices together to find out their area, we just learn to recognize
the derivatives of functions we've already seen.
If we know the derivative of \(4x\) is 4, then if someone asks for the integral
of 4, we can respond with "\(4x\)" (plus C, of course). It's like memorizing the
squares of numbers, not the square roots. When someone asks for the square
root of 121, dig through and remember that \(11 \times 11 = 121\).
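The "memorize the squares, look up the roots" trick is easy to sketch. This is only an illustration of working backwards from a table (the table size of 20 is arbitrary, and the name `known_squares` is mine):

```python
# Do the easy direction ahead of time: square a bunch of numbers...
known_squares = {n * n: n for n in range(1, 21)}

# ...then "solve" the hard direction by recognizing a result we've seen before.
print(known_squares[121])      # 11, because 11 * 11 = 121
print(known_squares.get(122))  # None: a result we've never produced
```

Integrals get the same treatment: take derivatives of functions we know, keep the results, and recognize them later.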
An analogy: Imagine an antiques dealer who knows the original vase just
from seeing a pile of shards.
How does he do it? Well, he takes replicas in the back room, drops them,
and looks at the pattern of pieces. Then he comes to your pile and says "Oh, I
think this must be a Ming Dynasty Vase from the 3rd Emperor."
He doesn't try to glue your pile back together: he's just seen that exact
vase break before, and your pile looks the same!
Now, there may be piles he's never seen, that are difficult or impossible to
recognize. In that case, the best we can do is to just add up the pieces (with a
computer, most likely). We might determine the original vase weighed 13.78
pounds. That's a data point, fine, but it's not as nice as knowing what the vase
was before it shattered.
This insight was never really explained to me: it's painful to add up (possibly changing) steps directly, especially when the pattern gets complicated. So,
just learn to recognize the pattern from the derivatives we've already seen.
Getting To Better Multiplication

Gluing together equally-sized steps looks like regular multiplication, right? You
bet. If we wanted 3 steps (0 to 1, 1 to 2, 2 to 3) of size 2, we might write:
\[ \int_0^3 2 \; dx = 6 \]
Again, this is a fancy way of saying "Accumulate 3 steps of size 2: what do
you get in total?". We are time-lapsing a sequence of equal changes.
Now, suppose someone asks you to add 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 +
2 + 2 + 2 + 2 + 2. You might say: "Geez, can't you write it more simply?" You
know, something like:
\[ \int_0^{13} 2 \; dx = 26 \]
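As a sanity check, here's a minimal sketch that does the accumulation the long way. The function name `accumulate` and its parameters are my own; it assumes the range divides evenly into steps of width dx.

```python
def accumulate(step_size, start, end, dx=1.0):
    """Add up (step_size * dx) for each slice of width dx between start and end."""
    slices = round((end - start) / dx)
    return sum(step_size * dx for _ in range(slices))

print(accumulate(2, 0, 3))           # 6.0, three steps of size 2
print(accumulate(2, 0, 13))          # 26.0, thirteen steps of size 2
print(accumulate(2, 0, 13, dx=0.5))  # 26.0, thinner slices, same total
```

With equal steps, slicing thinner doesn't change the total, which is why the integral of a constant collapses to plain multiplication.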
Creating The Abstract Rules

Have an idea how linear functions behave? Great. We can make a few abstract
rules, like working out the rules of algebra for ourselves.
If we know our output is a scaled version of our input (\(f(x) = ax\)) then
\[ \frac{d}{dx} ax = a \]
and
\[ \int a = ax + C \]
That is, the ratio of each output step to each input step is a constant \(a\) (4,
in our examples above). And now that we've broken the vase, we can work
backwards: if we accumulate steps of size \(a\), they must have come from a
pattern similar to \(a \cdot x\) (plus C, of course).

Notice how I wrote \(\int a\) and not \(\int a \; dx\): I wanted to focus on \(a\), and not
details like the width of the step (\(dx\)). Part of calculus is learning to expose the
right amount of detail.
One last note: if our output does not react at all to our input ("we'll charge
you a constant $2 no matter how much you buy... including nothing!") then
the steps are a constant 0:
\[ \frac{d}{dx} a = 0 \]
In other words, there is no difference in the before-and-after measurement.
Now, a pattern may have an occasional zero slice, if it stands still for a moment.
That's fine. But if every slice is zero, it means our pattern never changes.
There are a few subtleties down the road, but let's first learn "Me want
food" and then "Verily, I hunger."
CHAPTER 8
PLAYING WITH SQUARES

We've seen how lines behave. Now let's hone our skills with a more complex
function like \(f(x) = x^2\).
This scenario is more complex, so let's visualize it and walk through the
properties.
Imagine you're building a square garden, to plant veggies and enjoy cucumbers in a few months. You're not sure how large to make it. Too small,
and there's not enough food, but too large, and you'll draw the attention of the
veggie mafia.
You'll build the garden incrementally, foot-by-foot, until it looks right. Let's
say you build it slowly, starting from scratch and growing to a 10×10 plot:
What do we have? To the untrained eye, we've built a 10×10 garden,
which uses 40 feet of perimeter fencing (\(10 \times 4\)) and 100 square feet of topsoil
(\(10 \times 10\)). (Pretend topsoil is sold by the square foot, assuming a standard
thickness.)
Bring On The Calculus

That's it? The analysis is just the current perimeter and square footage? No
way.
By now, you should be clamoring to use X-Ray and Time-Lapse vision to see
what's happening under the hood. Why settle for a static description when we
can know the step-by-step description too?
We can analyze the behavior of the perimeter pretty easily:
\[ \text{Perimeter} = 4x \]
\[ \frac{d}{dx} \text{Perimeter} = 4 \]
The change in perimeter (\(\frac{dP}{dx}\)) is a constant 4. For every 1-foot increase in
x, we have a 4-foot jump in the perimeter.
We can visualize this process. As the square grows, we push out the existing
sides, and just add 4 corner pieces (in yellow):
The visual is nice, but not required. After our exposure to lines, we should
glance at an equation like \(p = 4x\) and realize that p jumps by 4 when x jumps
by 1.
Changing Area
Now, how does area change? Since squares are fairly new, let's X-Ray the shape
as it grows:
We can write out the size of each jump, like so:
Now that's interesting. The gap from \(0^2\) to \(1^2\) is 1. The gap from \(1^2\) to \(2^2\)
is 3. The gap from \(2^2\) to \(3^2\) is 5. And so on: the odd numbers are sandwiched
between the squares! What's going on?
Ah! Growing to the next-sized square means we've added a horizontal and
vertical strip (\(x + x\)) and a corner piece (1). If we currently have a square with
side x, the jump to the next square is \(2x + 1\). (If we have a 5×5 square, getting
to a 6×6 will be a jump of \(2 \cdot 5 + 1 = 11\). And yep, 36 - 25 = 11.)
Again, the visualization was nice, but it took effort. Algebra can simplify
the process.
In this setup, if we set our change to dx = 1, we get
\[ df = f(x + 1) - f(x) = (x + 1)^2 - x^2 = (x^2 + 2x + 1) - x^2 = 2x + 1 \]
Algebra predicted the size of the slices without a hitch.
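If you'd like to see the prediction line up with the actual jumps, here's a short sketch (variable names are mine): measure the gap between neighboring squares directly, then compare with \(2x + 1\).

```python
def square(x):
    return x ** 2

# Measure the jump from each square to the next, using a step of dx = 1.
jumps = [square(x + 1) - square(x) for x in range(6)]

# What the algebra predicted: two strips plus a corner.
predicted = [2 * x + 1 for x in range(6)]

print(jumps)      # [1, 3, 5, 7, 9, 11]
print(predicted)  # [1, 3, 5, 7, 9, 11]
```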
Integrals and the Veggie Mafia

The derivative takes a shape and gets the slices. Can we work backwards, from
the slices to the shape? Let's see.
Suppose the veggie mafia spies on your topsoil and fencing orders. They
can't see your garden directly, but what can they deduce from your purchases?
Let's say they observe a constant amount of fencing being delivered (4, 4,
4, 4...) but increasing orders of topsoil (1, 3, 5, 7, 9, 11...). What can they
work out?
A low-level goon might just add up the total amount accumulated (the
definite integral): "Heya boss, looks like they've built some garden with a total
perimeter of 40-feet, and total area of 100 square feet."
But that's not good enough! The goon doesn't know the shape you're trying
to build. He saw order after order go by without noticing the deeper pattern.
The crime boss is different: he wants the indefinite integral, the pattern you
are following. He's savvy enough to track the pattern as the orders come in:
"The area is increasing 1, 3, 5, 7... that's following a \(2x + 1\) area increase!"
Now, there are likely many shapes that could grow their area by \(2x + 1\). But,
combined with a constant perimeter increase of 4, he suspects you're making a
square garden after a few deliveries.
How does the godfather do it? Again, by working backwards. He's split
apart enough shapes (triangles, squares, rectangles, etc.) that he has a large
table of before-and-afters, just like the antiques dealer.
When he sees a change of \(2x + 1\), a square (\(x^2\)) is a strong candidate. Another
option might be a right triangle with sides \(x\) and \(2x\). Its area equation is \(\frac{1}{2} \cdot x \cdot 2x = x^2\),
so the area would change the same as a square.
And when he sees a perimeter change of 4, he knows the perimeter must
be \(4x\). Ah! There aren't too many shapes with both properties: a square is his
guess. Part of learning calculus is getting familiar with the typical origins of
the patterns you see, so you can quickly reverse-engineer them.
Suppose they see your orders change: your fencing deliveries drop to (2, 2,
2, 2...) and your topsoil orders change to (20, 20, 20, 20). What's going on?
Make a guess if you like.
Ready?
The veggie boss figures you've moved to a rectangular garden, with one side
determined by x, and the other side a fixed 20-feet. Does this guess work?
\[ f(x) = 20x \]
\[ \text{Perimeter} = 20 + 20 + x + x = 40 + 2x \]
\[ \text{Area} = 20x \]
\[ \frac{d}{dx} \text{Perimeter} = 2 \]
\[ \frac{d}{dx} \text{Area} = 20 \]
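We can play veggie boss ourselves with a few lines of Python. This sketch takes the guessed rectangle (one side x, the other a fixed 20 feet) and lists what each delivery would have to be as x grows a foot at a time; the helper name `deliveries` is made up for illustration.

```python
def perimeter(x):
    return 20 + 20 + x + x  # two fixed sides of 20, two sides of length x

def area(x):
    return 20 * x

def deliveries(fn, count):
    """The extra amount needed each time x grows by 1 foot."""
    return [fn(x + 1) - fn(x) for x in range(count)]

print(deliveries(perimeter, 4))  # [2, 2, 2, 2] feet of fencing
print(deliveries(area, 4))       # [20, 20, 20, 20] square feet of topsoil
```

Both lists match the orders the mafia observed, so the rectangle guess is consistent with the evidence.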
Wow, it checks out. No wonder he's the godfather.
Lastly, what if the godfather saw topsoil orders of (5, 7, 9, 11, 13)? He
might assume you're still building a square (\(2x + 1\) pattern), but you started
with a 2×2 garden already made. Your first jump was 5, which would have
happened if x was already 2 (\(2x + 1 = 5\)).
The mob boss is a master antiques dealer: he sees the pattern of shards
you're bringing in and quickly determines the original shape (indefinite integral). The henchman can only tell you the running totals so far (definite
integral).
Wrapping It All Up
It looks like we're ready for another rule, to explain how squares change. If we
leave \(dx\) as it is, we can write:
\[ \frac{d}{dx} x^2 = \frac{f(x + dx) - f(x)}{dx} = \frac{(x + dx)^2 - (x)^2}{dx} \]
\[ = \frac{x^2 + 2x \, dx + (dx)^2 - x^2}{dx} = \frac{2x \, dx + (dx)^2}{dx} = 2x + dx \]
Ok! That's the abbreviated way of saying "Grow by two sides and the corner". Let's plug this into the computer to check:
Uh oh! We hand-computed the derivative of \(x^2\) as \(2x + dx\) (which is usually
\(2x + 1\)), but the computer says it's just \(2x\).
But isn't the difference from \(4^2\) to \(5^2\) exactly 25 - 16 = 9, and not 8? What
happened to that corner piece? The mystery continues in the next episode.
CHAPTER 9
WORKING WITH INFINITY

Last time, we manually worked on the derivative of \(x^2\) as \(2x + 1\). But the official
derivative, according to the calculator, was \(2x\). What gives?
The answer relies on the concept of infinite accuracy. Infinity is a fascinating
and scary concept: there are entire classes (Analysis) that study it. We'll avoid
learning the nuances of every theory: our goal is a practical understanding
of how infinity can help us work out the rules of calculus.
Insight: Sometimes Infinity Can Be Measured

Here's a quick brainteaser for you. Two friends are 10 miles apart, moving
towards each other at 5mph each. A mosquito flies quickly between them,
touching one person, then the other, on and on, until the friends high-five and
the mosquito is squished.
Let's say the mosquito travels a zippy 20mph as it goes. Can you figure out
how far it flew before its demise?
Yikes. This one is tricky: once the mosquito leaves the first person, touches
the second, and turns around... the first person has moved closer! We have an
infinite number of ever-diminishing distances to add up. The question seems
painfully difficult to solve, right?
Well, how about this reasoning: from the perspective of the people walking,
they're going to walk for an hour total. After all, they start 10 miles apart, and
the gap shrinks at 10 miles per hour (5mph + 5mph). Therefore, the mosquito
must be flying for an hour, and go 20 miles.
Whoa! Did we just find the outcome of a process with an infinite number
of steps? I think so!
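If the shortcut feels too easy, we can grind through the mosquito's viewpoint too. This is a rough sketch under my own modeling assumptions: the mosquito starts on one friend, each "leg" ends when it reaches the other friend, and we stop after 200 legs (by then the remaining gap is far too small to matter). The function name and parameters are invented for the sketch.

```python
def mosquito_distance(gap=10.0, walker_speed=5.0, mosquito_speed=20.0, legs=200):
    """Add up the mosquito's back-and-forth legs, one at a time."""
    total = 0.0
    for _ in range(legs):
        # The mosquito and the friend it flies toward close the gap together.
        leg_time = gap / (mosquito_speed + walker_speed)
        total += mosquito_speed * leg_time
        # Meanwhile, both friends kept walking, so the gap between them shrank.
        gap -= 2 * walker_speed * leg_time
    return total

print(mosquito_distance())  # creeps up to 20 miles
print(20.0 * (10.0 / (5.0 + 5.0)))  # the walkers' shortcut: 20 mph for 1 hour
```

Each leg is shorter than the last, yet the running total settles on the same 20 miles the walkers' viewpoint gave us in one line.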
Splitting A Whole Into Infinite Parts

It's time to turn our step-by-step thinking into overdrive. Can we think about
a finite shape being split into infinite parts?
In the beginning of the course, we saw a circle could be split into rings.
How many? Well, an infinite number!
A number line can be split into an infinite number of neighboring points.
How many decimals would you say there are between 1.0 and 2.0?
The path of a mosquito can be seen as a whole, or a journey subdivided
into an infinite number of segments.
When we have two viewpoints (the mosquito, and the walkers), we can pick
the one that's easier to work with. In this case, the walkers' holistic viewpoint
is simpler. With the circle, it's easier to think about the rings themselves. It's
nice to have both options available.
Here's another example: can you divide a cake into 3 equal portions, by
only cutting into quarters?
It's a weird question... but possible! Cut an entire cake into quarters. Share
3 pieces and leave 1. Cut the remaining piece into quarters. Share 3 pieces,
leave 1. Keep repeating this process: at every step, everyone has received an
equal share, and the remaining cake will be split evenly as well. Wouldn't this
plan maintain an even split among 3 people?
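Here's a small sketch of the cake plan using exact fractions, so no rounding sneaks in. The function name `share_after` is mine; it tracks one person's share and the leftover piece after a given number of rounds.

```python
from fractions import Fraction

def share_after(rounds):
    """One person's share, and the leftover piece, after cutting `rounds` times."""
    share = Fraction(0)
    remaining = Fraction(1)
    for _ in range(rounds):
        piece = remaining / 4  # cut what's left into quarters
        share += piece         # each of the 3 people takes one quarter
        remaining = piece      # the fourth quarter is left for the next round
    return share, remaining

for rounds in [1, 2, 5, 10]:
    share, remaining = share_after(rounds)
    print(rounds, share, float(share), "left over:", remaining)
```

Each person's share (1/4, then 5/16, and so on) keeps creeping toward 1/3 while the leftover piece shrinks toward nothing, and at every stage the three shares plus the leftover still make one whole cake.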
We're seeing the intuition behind infinite X-Ray and Time-lapse vision: zooming in to turn a whole into an infinite sequence. At first, we might think dividing something into infinite parts requires each part to be nothing. But, that's
not right: the number line can be subdivided infinitely, yet there's a finite gap
between 1.0 and 2.0.
Two Fingers Pointing At The Same Moon

Why can we understand variations of the letter A, even when pixelated?
Even though the rendering is different, we see the idea being pointed to.
All three versions, from perfectly smooth to jagged, create the same letter A in
our heads (or, are you unable to read words when written out with rectangular
pixels?). An infinite sequence can point to the same result we'd find if we took
it all at once.
In calculus, there are detailed rules about how to find what result an infinite
set of steps points to. And, there are certain sequences that cannot be worked
out. But, for this primer, we'll deal with functions that behave nicely.
We're used to jumping between finite representations of the same idea (5
= V = |||||). Now we're seeing we can convert between a finite and infinite
representation of an idea, similar to \(\frac{1}{3} = .333\ldots = .3 + .03 + .003 + \ldots\).
When we turned a circle into a ring-triangle, we said "The infinitely-many
rings in our circle can be turned into the infinitely-many boards that make up
a triangle. And the resulting triangle is easy to measure."
Today's goal isn't to become experts with infinity. It's to intuitively appreciate a practical conclusion: a sequence of infinitely many parts can still be
measured, and reach the same conclusions as analyzing the whole. They're just
two different descriptions of the same idea.
CHAPTER 10
THE THEORY OF DERIVATIVES
The last lesson showed that an infinite sequence of steps could lead to a finite conclusion. Let's put it into practice, and see how breaking change into
infinitely small parts can point to the true amount.
Analogy: Measuring Heart Rates

Imagine you're a doctor trying to measure a patient's heart rate while exercising. You put a guy on a treadmill, strap on the electrodes, and get him running.
The machine spits out 180 beats per minute. That must be his heart rate, right?
Nope. That's his heart rate when observed by doctors and covered in electrodes. Wouldn't that scenario be stressful? And what if your Nixon-era electrodes get tangled on themselves, and tug on his legs while running?
Ah. We need the electrodes to get some measurement. But, right afterwards, we need to remove the effect of the electrodes themselves. For example, if we measure 180 bpm, and knew the electrodes added 5 bpm of stress,
we'd know the true heart rate was 175.
The key is making the knowingly-flawed measurement, to get a reading,
then correcting it as if the instrument wasn't there.
Measuring the Derivative

Measuring the derivative is just like putting electrodes on a function and making it run. For \(f(x) = x^2\), we stick an electrode of +1 onto it, to see how it
reacts:
The horizontal stripe is the result of our change applied along the top of
the shape. The vertical stripe is our change moving along the side. And what's
the corner?
It's part of the horizontal change interacting with the vertical one! This
is an electrode getting tangled in its own wires, a measurement artifact that
needs to go.
Throwing Away The Artificial Results

The founders of calculus intuitively recognized which components of change
were "artificial" and just threw them away. They saw that the corner piece
was the result of our test measurement interacting with itself, and shouldn't be
included.
In modern times, we created official theories about how this is done:
Limits: We let the measurement artifacts get smaller and smaller until
they effectively disappear (cannot be distinguished from zero).
Infinitesimals: Create a new type of number that lets us try infinitely-small change on a separate, tiny number system. When we bring the
result back to our regular number system, the artificial elements are removed.
There are entire classes dedicated to exploring these theories. The practical
upshot is realizing how to take a measurement and throw away the parts we
don't need.
Here's how the derivative is defined using limits:
Step | Example
Prereq: Start with a function to study | \(f(x) = x^2\)
1: Change the input by dx, our test change | \(f(x + dx) = (x + dx)^2 = x^2 + 2x \, dx + (dx)^2\)
2: Find the resulting change in output, df | \(f(x + dx) - f(x) = 2x \, dx + (dx)^2\)
3: Find \(\frac{df}{dx}\) | \(\frac{2x \, dx + (dx)^2}{dx} = 2x + dx\)
4: Throw away the measurement artifacts | \(2x + dx \overset{dx = 0}{\Longrightarrow} 2x\)
Wow! We found the official derivative for \(\frac{d}{dx} x^2\) on our own:
Now, a few questions:
Why do we measure \(\frac{df}{dx}\), and not the actual change df? Think of df as
the raw change that happened as we made a step. The ratio is helpful
because it normalizes comparisons, showing us how much the output
reacts to the input. Now, sometimes it can be helpful to isolate the actual
change that happened in an interval, and rewrite \(\frac{df}{dx} = 2x\) as \(df = 2x \, dx\).
How do we set dx to 0? I see dx as the size of the instrument used to
measure the change in a function. After we have the measurement with
a real instrument (\(2x + dx\)), we figure out what the measurement would
be if the instrument wasn't there to interfere (\(2x\)).
But isn't the 2x + 1 pattern correct? The integers have the squares 0,
1, 4, 9, 16, 25 which has the difference pattern 1, 3, 5, 7, 9. Because
the integers must use a fixed interval of 1, using \(dx = 1\) and keeping it
around is a perfectly accurate way to measure how they change. However, decimals don't have a fixed gap, so \(2x\) is the best result for how fast
the change between \(2^2\) and \((2.0000001)^2\) is happening. (Except, replace
2.0000001 with the number immediately following 2.0, whatever that is).
If there's no +1, when does the corner get filled in? The diagram
shows how area grows in the presence of a crude instrument, dx. It isn't
a decree on how more area is added. The actual change of \(2x\) is the
horizontal and vertical strip. The corner represents a change from the
vertical strip interfering with the horizontal one. It's area, sure, but it
won't be added on this step. I imagine a square that grows by two strips,
melts to absorb the area (forming a larger square), then grows again,
then melts, and so on. The square isn't staying still long enough to
have the horizontal and vertical extensions interact.
Practical conclusion: We can start with a knowingly-flawed measurement (\(f'(x) \sim 2x + dx\)), and deduce the result it points to (\(f'(x) = 2x\)). The
theories of exactly *how* we throw away \(dx\) aren't necessary to master today.
The key is realizing there are measurement artifacts that can be removed when
modeling how a pattern changes.
(Still shaky about exactly how dx can appear and disappear? Good. This
question took mathematicians decades to figure out. Here's a deeper discussion of
how the theory works, but remember this: When measuring, ignore the effect of
the instrument.)
CHAPTER 11
THE FUNDAMENTAL THEOREM OF CALCULUS (FTOC)
The Fundamental Theorem of Calculus is the big aha! moment, and something
you might have noticed all along:
X-Ray and Time-Lapse vision let us see an existing pattern as an accumulated sequence of changes
The two viewpoints are opposites: X-Rays break things apart, Time-Lapses put them together
This might seem "obvious", but it's only because we've explored several
examples. Is it truly obvious that we can separate a circle into rings to find the
area? The Fundamental Theorem of Calculus gently reminds us we have a few
ways to look at a pattern. ("Might I suggest the ring-by-ring viewpoint? Makes
things easier to measure, I think.")
Part 1: Shortcuts For Definite Integrals

If derivatives and integrals are opposites, we can shortcut the manual accumulation process involved in definite integrals.
For example, what is 1 + 3 + 5 + 7 + 9? (Let's accumulate discrete steps
because they're simpler to visualize.)
The hard way, using the definite integral, is to start cranking through the
addition. The easy way is to realize that pattern came from a growing square.
We know the last change (+9) happened at x=4, so we've built up to a 5×5
square. Therefore, the sum of the entire sequence is 25:
Neat! If we have the original pattern, we have a shortcut to measure the
size of the steps.
How about a partial sequence like 5 + 7 + 9? Well, just take the total
accumulation and subtract the part we're missing (in this case, the missing 1
+ 3 represents a missing 2×2 square).
And yep, the sum of the partial sequence is: 5×5 - 2×2 = 25 - 4 = 21.
I hope the strategy clicks for you: we avoid manually computing the definite
integral by finding the original pattern.
Here's the first part of the FTOC in fancy language. If we have a pattern of
steps and the original pattern, the shortcut for the definite integral is:
\[ \int_a^b \text{steps}(x) \, dx = \text{Original}(b) - \text{Original}(a) \]
Intuitively, I read this as "Adding up all the changes from a to b is the same
as getting the difference between a and b". Formally, you'll see \(f(x) = \text{steps}(x)\)
and \(F(x) = \text{Original}(x)\), which I think is confusing. Label the steps as steps,
and the original as the original.
Why is this cool? The definite integral is a gritty mechanical computation,
and the indefinite integral is a nice, clean formula. Just take the difference
between the endpoints to know the net result of what happened in the middle!
(That makes sense, right?)
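Here's the shortcut side by side with the hard way, using the discrete growing-square example from this section (steps of \(2x + 1\), taken one whole unit at a time). The names `steps`, `original`, and `add_up` mirror the labels above but are otherwise my own.

```python
def steps(x):
    return 2 * x + 1  # the jump from an x-by-x square to the next size up

def original(x):
    return x ** 2     # the pattern the steps came from

def add_up(a, b):
    """The hard way: crank through every step from x = a up to (not including) b."""
    return sum(steps(x) for x in range(a, b))

print(add_up(0, 5), original(5) - original(0))  # 25 25  (1 + 3 + 5 + 7 + 9)
print(add_up(2, 5), original(5) - original(2))  # 21 21  (5 + 7 + 9)
```

The left number took a loop; the right number took two lookups and a subtraction.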
Part 2: Finding The Indefinite Integral

Ok. Part 1 said that if we have the original function, we can skip the manual
computation of the steps. But how do we find the original?
FTOC Part Deux to the rescue!
Let's pretend there's some original function (currently unknown) that tracks
the accumulation:
\[ \text{Accumulation}(x) = \int_a^x \text{steps}(x) \, dx \]
The FTOC says the derivative of that magic function will be the steps we
have:
\[ \text{Accumulation}'(x) = \text{steps}(x) \]
Now we can work backwards. If we can find some random function, take its
derivative, notice that it matches the steps we have, we can use that function
as our original!
Skip the painful process of thinking about what function could make the
steps we have. Just take a bunch of them, break them, and see which matches
up. It's our vase analogy, remember? The FTOC gives us "official permission"
to work backwards. In my head, I think "The next step in the total accumulation
is our current amount! That's why the derivative of the accumulation matches
the steps we have."
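To make that last thought concrete, here's a numerical sketch (not a proof). I picked steps(x) = 2x as the example, built the accumulation the gritty way by adding thin slices starting from a = 0, and then measured how fast that accumulation grows. The slice count and the small dx are arbitrary choices of mine.

```python
def steps(x):
    return 2 * x

def accumulation(x, a=0.0, slices=10_000):
    """Add up steps * dx over thin slices from a to x (a midpoint sum)."""
    dx = (x - a) / slices
    return sum(steps(a + (i + 0.5) * dx) * dx for i in range(slices))

def rate_of_change(fn, x, dx=1e-4):
    """Measure how fast fn is changing at x, using a small symmetric step."""
    return (fn(x + dx) - fn(x - dx)) / (2 * dx)

for x in [1.0, 2.0, 3.0]:
    # The growth rate of the running total should match the step at that point.
    print(x, rate_of_change(accumulation, x), steps(x))
```

The middle column lands (up to tiny rounding noise) on the right-hand column: the accumulation is growing by exactly the current step.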
Technically, a function whose derivative is equal to the current steps is
called an anti-derivative (One anti-derivative of 2 is 2x; another is 2x + 10).
The FTOC tells us any anti-derivative will be the original pattern (+C of course).
This is surprising: it's like saying everyone who behaves like Bill Cosby is
Bill Cosby. But in calculus, if a function has steps that match the ones we're
looking at, it's the original source.
The practical conclusion is integration and differentiation are opposites.
Have a pattern of steps? Integrate to get the original. Have the original?
Differentiate to get the pattern of steps. Jump back and forth as many times as
you like.
Next Steps
Phew! This was a theory-heavy set of lessons. It was hopefully enough to
make the connections click. I'm most interested in developing metaphorical understanding, and leave the detailed proofs to a follow-up. No need to recite
the recipe before tasting the meal.
The key insights from:
Infinity: A finite result can be viewed with a sequence of infinite steps
Derivatives: We can take a knowingly-flawed measurement and morph
it into the ideal one
Fundamental Theorem Of Calculus: We can use the original function
as a shortcut to track the intermediate steps
In the upcoming lessons, we'll work through a few famous calculus rules
and applications. The ultimate goal will be to work out, for ourselves, how to
make this happen:
We have an intuitive understanding that the sequence above is possible. By the
end, you'll be able to play the music on your own.
CHAPTER 12
THE BASIC ARITHMETIC OF CALCULUS

Remember learning arithmetic? After seeing how to multiply small numbers,
we learned how to multiply numbers with several digits:
$$13 \times 15 = (10 + 3)(10 + 5) = 100 + 30 + 50 + 15$$
We can't just combine the first digits ($10 \times 10$) and the second ($3 \times 5$) and
call it done. We have to walk through the cross-multiplication.
Calculus is similar. If we have the whole function, we can blithely say
that $f(x)$ has derivative $f'(x)$. But that isn't illuminating, and doesn't explain what
happens behind the scenes.
If we can describe our function in terms of a building block $x$ (such as
$f(x) = 3x^2 + x$), then we should be able to find the derivative, the pattern of
changes, in terms of that same building block. If we have two types of building
blocks ($f = a \cdot b$), we'll get the derivative in terms of those two building blocks.
Here's the general strategy:
Imagine a scenario with a few building blocks ($\text{area} = \text{length} \cdot \text{width}$)
Let every component change
Measure the change in the overall system
Remove the measurement artifacts (our instruments interfering with each other)
Once we know how systems break apart, we can reverse-engineer them into
the integral (yay for the FTOC!).
Addition
Let's start off easy: how does a system with two added components behave?
In the real world, this could be sending two friends (Frank and George) to
build a fence. Let's say Frank gets the wood, and George gets the paint. What's
the total cost?
Total = Frank's cost + George's cost
$$t(x) = f(x) + g(x)$$
The derivative of the entire system, $\frac{dt}{dx}$, is the cost per additional foot. Intuitively, we suspect the total increase is the sum of the increases in the parts:
$$\frac{dt}{dx} = \frac{df}{dx} + \frac{dg}{dx}$$
That relationship makes sense, right? Let's say Frank's cost is $3/foot for
the wood, and George adds $0.50/foot for the paint. If we ask for another foot,
the total cost will increase by $3.50.
Here's the math for that result:
Original: $f + g$
New: $(f + df) + (g + dg)$
Change: $(f + df) + (g + dg) - (f + g) = df + dg$
In my head, I imagine $x$, the amount you requested, changing silently in a
corner. This creates a visible change in $f$ (size $df$) and $g$ (size $dg$), and we see
the total change as $df + dg$.
It seems we should just combine the total up front, writing $\text{total} = 3.5x$, not
$\text{total} = f(x) + g(x) = 3x + 0.5x$. Normally, we would simplify an equation, but it's
sometimes helpful to list every contribution (total = base + shipping + tax).
In our case, we see most of the increase is due to Frank (the cost of wood).
Remembering the derivative is the "per dx" rate, we write:
$$\frac{d}{dx}\left(f(x) + g(x)\right) = \frac{df}{dx} + \frac{dg}{dx}$$
But ugh, look at all that notation! Let's trim it down:
Write $f$ instead of $f(x)$. We'll assume a single letter is an entire function,
and by the Third Edict of The Grand Math Poombahs, our functions will
use a parameter $x$.
We'll express the derivative using a single quote ($f'$), not with a ratio
($\frac{df}{dx}$). We're most interested in the relationship between the parts (addition), not the gritty details of the parts themselves.
So now the addition rule becomes:
$$(f + g)' = f' + g'$$
Much better! Here's how I read it: Take a system made of several parts:
$(f + g)$. The change in the overall system, $(f + g)'$, can be found by adding the
change from each part.
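A quick way to convince yourself of the addition rule is to measure it. The Python sketch below is an illustration, not part of the original lesson; it reuses the fence prices from the text ($3/foot and $0.50/foot), while the function names are made up.

```python
def f(x):
    # Frank's cost for wood: $3 per foot of fence.
    return 3.0 * x

def g(x):
    # George's cost for paint: $0.50 per foot of fence.
    return 0.5 * x

def total(x):
    # The whole system: two added parts.
    return f(x) + g(x)

def change(func, x, dx):
    # How much func moves when x silently grows by dx.
    return func(x + dx) - func(x)

x, dx = 10.0, 1.0
df = change(f, x, dx)
dg = change(g, x, dx)
dt = change(total, x, dx)
print(df, dg, dt)  # the total change should equal df + dg
```

Listing the parts separately also shows who is responsible for most of the increase, which the simplified 3.5x hides.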
Multiplication
Let's try a trickier scenario. Instead of inputs that are added (almost oblivious
to each other), what if they are multiplied?
Suppose Frank and George are making a rectangular garden for you. Frank
handles the width and George takes care of the height. Whenever you clap,
they move... but by different amounts!
Frank's steps are 3 feet long, but George's are only 2 feet long (zookeeping
accident, don't ask). How can we describe the system?
$$\text{area} = \text{width} \cdot \text{height} = f(x) \cdot g(x)$$
$$f(x) = 3x$$
$$g(x) = 2x$$
We have linear parts, so the derivatives are simple: $f'(x) = 3$ and $g'(x) = 2$.
What happens on the next clap?
Looks familiar! We have a horizontal strip, a vertical strip, and a corner
piece. We can work out the amounts with algebra:
Original: $f \cdot g$
New: $(f + df) \cdot (g + dg) = (f \cdot g) + (f \cdot dg) + (g \cdot df) + (df \cdot dg)$
Change: $f \cdot dg + g \cdot df + df \cdot dg$
Let's see this change more closely:
The horizontal strip happened when $f$ changed (by $df$), and $g$ was the
same value
The vertical strip was made when $g$ changed (by $dg$), and $f$ was the
same value
The corner piece ($df \cdot dg$) happened when the change in one component
($df$) interacted with the change in the other ($dg$)
The corner piece is our sample measurement getting tangled on itself, and
should be removed. (If we're forced to move in whole units, then the corner
is fine. But most real-world systems can change continuously, by any decimal
number, and we want the measurement artifacts removed.)
To find the total change, we drop the $df \cdot dg$ term (interference between
the changes) and get:
$$f \cdot dg + g \cdot df$$
I won't let you forget the derivative is on a "per dx" basis, so we write:
$$\frac{\text{total change}}{dx} = f\frac{dg}{dx} + g\frac{df}{dx}$$
$$(f \cdot g)' = f \cdot g' + g \cdot f'$$
There is an implicit $x$ changing off in the distance, which makes $f$ and $g$
move. We hide these details to make the notation simpler.
In English: Take a scenario with multiplied parts. As they change, and
continue to be multiplied, add up the new horizontal and vertical strips that
are formed.
Let's try out the rule: if we have a 12×8 garden and increment by a whole
step, what change will we see?
In this case, we'll use the discrete version of the rule since we're forced to
move as a whole step:
Vertical strip: $f \cdot dg = 12 \cdot 2 = 24$
Horizontal strip: $g \cdot df = 8 \cdot 3 = 24$
Corner piece: $df \cdot dg = 3 \cdot 2 = 6$
Total change: $24 + 24 + 6 = 54$
Let's test it. We go from 12×8 (96 square feet) to 15×10 (150 square feet).
And yep, the area increase was 150 - 96 = 54 square feet!
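The garden bookkeeping above can be replayed in a few lines of Python. This is a sketch for illustration (the function name is an assumption, not something from the lesson); it also shrinks the step to show why the corner piece becomes negligible when the change is small.

```python
def garden_change(f, g, df, dg):
    """Split the growth of an f-by-g rectangle into its three pieces."""
    vertical_strip = f * dg    # g changed while f stayed the same
    horizontal_strip = g * df  # f changed while g stayed the same
    corner = df * dg           # the two changes interacting with each other
    return vertical_strip, horizontal_strip, corner

# The whole-step example from the text: a 12x8 garden growing to 15x10.
pieces = garden_change(12, 8, 3, 2)
print(pieces, sum(pieces), 15 * 10 - 12 * 8)

# Shrink the step: the corner's share of the change fades away.
for scale in [1, 0.1, 0.01]:
    v, h, c = garden_change(12, 8, 3 * scale, 2 * scale)
    print(scale, c / (v + h))
```

With a whole step the corner is a real 6 square feet; as the step shrinks, the ratio of corner to strips heads toward zero, which is why the continuous rule drops it.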
Simple Division (Inverses)

Inverses can be tough to visualize: as $x$ gets bigger, $\frac{1}{x}$ gets smaller. Let's take it
slow.
Suppose you're sharing a cake with Frank. You've just cut it in half, about
to take a bite and... George shuffles in. He looks upset, and you're not about
to mention the fresh set of claw marks.
But you've just cut the cake in half, what can you do?
Cut it again. You and Frank can cut your existing portion in thirds, and give
George a piece:
Neat! Now everyone has 1/3 of the total. You gave up 1/3 of your amount
(1/2), that is, you each gave George 1/6 of the total.
Time to eat! But just as you're about to bite in... the veggie godfather
walks in. Oh, he'll definitely want a piece. What do you do?
Cut it again. Everyone smooshes together their portion, cuts it in fourths,
and hands one piece to the Don. The cake is split evenly again.
This is step-by-step thinking applied to division:
Your original share is $\frac{1}{x}$ (when $x = 2$, you have 1/2)
Someone walks in
Your new share becomes $\frac{1}{x+1}$
How did your amount of cake change? Well, you took your original slice
($\frac{1}{x}$), cut it into the new number of pieces ($\frac{1}{x+1}$), and gave one away (the change
is negative):
$$\frac{1}{x} \cdot \frac{-1}{x+1} = \frac{-1}{x(x+1)}$$
We can probably guess that the +1 is a measurement artifact because we
forced an integer change in $x$. If we call the test change $dx$, we can find the
difference between the new amount ($\frac{1}{x+dx}$) and the original ($\frac{1}{x}$):
$$\frac{1}{x+dx} - \frac{1}{x} = \frac{x}{x(x+dx)} - \frac{x+dx}{x(x+dx)} = \frac{-dx}{x(x+dx)}$$
After finding the total change (and its annoying algebra), we divide by $dx$
to get the change on a "per dx" basis:
$$\frac{-1}{x(x+dx)}$$
Now we remove the leftover $dx$, the measurement artifact:
$$\frac{-1}{x(x+0)} = \frac{-1}{x^2}$$
Phew! We've found how a 1/x split changes as more people are added.
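To watch the measurement artifact fade, here is a minimal Python sketch. It is an illustration rather than anything from the original text, and the function names are assumptions. It measures the change in a 1/x share per unit of dx and compares it with the formulas derived above.

```python
def share_change_rate(x, dx):
    # Measured change in a 1/x share, on a "per dx" basis.
    return (1 / (x + dx) - 1 / x) / dx

def with_artifact(x, dx):
    # The algebraic result before removing the leftover dx.
    return -1 / (x * (x + dx))

def ideal_rate(x):
    # The result after setting the leftover dx to zero.
    return -1 / x ** 2

x = 2
for dx in [1, 0.1, 0.001, 0.000001]:
    print(dx, share_change_rate(x, dx), with_artifact(x, dx), ideal_rate(x))
```

With dx = 1 we get the whole-person cake answer (a loss of 1/6 when going from 2 to 3 people); as dx shrinks, the measured rate settles toward -1/x^2.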
Let's try it out: You are splitting a $1000 bill among 5 people. A sixth
person enters. How much money do you save?
You'll personally save 1/5 - 1/6 = 1/30 of the total cost (cut your share into
6 pieces, give the new guy one portion to pay). That's about 3%, or $30. Not
bad for a quick calculation!
Let's work it backwards: how large is our group when we're saving about
$100 per person? Well, $100 is 1/10 of the total. Since $\frac{1}{3^2} \approx \frac{1}{10}$, we'll hit that
savings rate around x = 3 people.
And yep, going from 3 to 4 people means each person's share goes from
$333.33 to $250, a drop of roughly $100. Not bad! (If we added people fractionally, we
could hit the number exactly.)
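The bill-splitting estimates are easy to check by hand, but a tiny Python sketch makes the comparison explicit. The function names are hypothetical; the numbers ($1000, 5 people, 3 people) come from the example above. The "estimate" uses the continuous 1/x^2 rate, while the "exact" value uses a whole extra person.

```python
def share(total, people):
    # Each person's portion of the bill.
    return total / people

def savings_exact(total, people):
    # What each person saves when exactly one more person joins.
    return share(total, people) - share(total, people + 1)

def savings_estimate(total, people):
    # Quick estimate from the derivative: total * (1 / people^2).
    return total / people ** 2

for people in [3, 5]:
    print(people, savings_exact(1000, people), savings_estimate(1000, people))
```

The estimate runs a little high because adding a whole person is a big step for a small group; the two numbers move closer together as the group grows.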
Questions
We didn't explicitly talk about scaling by a constant, such as finding the derivative of $f(x) = 3x$. Can you use the product rule to figure out how it changes?
(Hint: imagine a rectangle with a fixed 3 for one side, and $x$ for the other.)
Now, how about the addition rule? How would $f(x) = x + x + x$ behave?
CHAPTER 13
PATTERNS IN THE RULES
We've uncovered the first few rules of calculus:
$$(f + g)' = f' + g'$$
$$(f \cdot g)' = f \cdot g' + g \cdot f'$$
$$\left(\frac{1}{x}\right)' = -\frac{1}{x^2}$$
Instead of blasting through more rules, step back. Is there a pattern here?
Combining Perspectives
Imagine a function as a business with interacting departments, or a machine
with interconnected parts. What happens when we make a change?
If our business has 4 departments, there are 4 perspectives to account for.
It sounds painfully simple when written out: count the changes coming from
each part!
No matter the specific interaction between parts F and G (addition, subtraction, multiplication, exponents...), we just have two perspectives to consider:
Aha! That's why the rules for $(f + g)'$ and $(f \cdot g)'$ are two added perspectives.
The derivative of $\frac{1}{x}$ has one perspective because there's just one moving part,
$x$. (If you like, there is a contribution from "1" about the change it experiences:
nothing. No matter how much you yell, 1 stays 1.)
The exact contribution from a perspective depends on the interaction:
With addition, each part adds a direct change ($df + dg$).
With multiplication, each part thinks it'll add a rectangular strip
($f \cdot dg + g \cdot df$). (I'm using $df$ instead of $f'$ to help us think about the slice
being added.)
You might forget the exact form of the multiplication rule. But you can
think "The derivative of $f \cdot g$ must be something with $df$ + something with
$dg$."
Let's go further: what about the derivative of $a^b + c$? You guessed it, 3
perspectives that should be added: something involving $da$ plus something
involving $db$ plus something involving $dc$.
We can predict the shape of derivatives for gnarly equations. What's the
derivative of:
$$x^y \cdot \frac{u}{v}$$
Wow. I can't rattle that off, but I can say it'll be something involving 4
additions ($dx$, $dy$, $du$ and $dv$). Guess the shape of a derivative, even if you
don't know the exact description.
Why does this work? Well, suppose we had a change that was influenced
by both $da$ and $db$, such as $15 \cdot da \cdot db$: that'd be our instrument interfering
with itself!
Only direct changes on a single variable are counted (such as $3\,da$ or
$12\,db$), and "changes on changes" like $15 \cdot da \cdot db$ are ignored.
Dimensional Intuition
Remember that derivatives are a fancier form of division. What happens when
we make a division like $\frac{x^3}{x}$? We divide volume by length, and get area (one
dimension down).
What happens when we do $\frac{d}{dx} x^3$? You might not know yet, but you can bet
we'll be dropping a dimension.
Dimensions   Example             Description
3            $x^3$               Volume (cubic growth)
2            $x^2$               Area (square growth)
1            $x$                 Length (linear growth)
0            $x^0 = 1$           Constant (no change)
-1           $\frac{1}{x}$       Inverse length (per length)
-2           $\frac{1}{x^2}$     Inverse area (per area)
-3           $\frac{1}{x^3}$     Inverse volume (per volume)
When we divide or take a derivative, we drop a dimension and hop down
the table. Volume is built from area (slices $dx$ thick), area is built from length
($dx$ wide).
A constant value, like 3, has no dimension in the following sense: it never
produces a set of slices. There will never be a jump to the next value. Once
we have a constant value, we get stuck on the table because the 0 pattern
always produces another 0 pattern (volume to area to lines to constant to zero
to zero...).
We can have negative dimensions as well: "per length", "per area", "per
volume", etc. Derivatives still decrease the dimension, so when seeing $\frac{1}{x}$, we
know the derivative will resemble $\frac{1}{x^2}$ as the dimension drops a level.
An important caveat: Calculus only cares about quantities, not their dimension. The equations will happily combine $x$ and $x^2$, even though we know you
can't mix length with area. Units add a level of meaning from outside the equation, which helps keep things organized and warns us if we've gone awry (if you
differentiate area and get volume, you know something went wrong).
Thinking With Dimensions

You can think about the dimension of a derivative without digging into a specific formula. Imagine the following scenario:
Take a string and wrap it tight around a quarter. Take another string and
wrap it tight around the Earth.
Lengthen both strings, adding more to the end, so there's a 1-inch gap all
the way around the quarter, and a 1-inch gap all the way around
the Earth (like having a ring floating around Saturn).
Quiz: Which scenario needed more extra string to create? Is it more
string to put a 1-inch gap around the quarter, or a 1-inch gap around the
Earth?
We can crunch through the formula, but think higher-level. Because circumference is a 1-d line, its reaction to change (the derivative) will be a constant.
No matter the current size, the circumference will change the same amount
with every push. So, the extra string needed is the same for both! (About 6.28
inches per inch of radius increase).
Now, suppose we were painting the sphere instead of putting a string around
it. Ah, well, area is squared, therefore the derivative is a dimension lower (linear). If we have a 5-inch and 10-inch sphere, and make them 1-inch bigger
each, the larger sphere will require double the extra paint.
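If the string answer feels suspicious, it can be checked numerically. The Python sketch below is an illustration, not part of the original lesson. The radii are rough assumptions (a quarter of about 0.48 inches, an Earth of about 250 million inches), the "5-inch" and "10-inch" spheres are taken to mean radius, and the function names are made up.

```python
import math

def extra_string(radius, gap):
    # Circumference after adding the gap, minus the circumference before.
    return 2 * math.pi * (radius + gap) - 2 * math.pi * radius

def extra_paint(radius, gap):
    # Surface area of the bigger sphere, minus the surface area before.
    return 4 * math.pi * (radius + gap) ** 2 - 4 * math.pi * radius ** 2

QUARTER_RADIUS_IN = 0.4775  # assumption: roughly half of a 0.955-inch coin
EARTH_RADIUS_IN = 2.5e8     # assumption: a rough figure, good enough here

print(extra_string(QUARTER_RADIUS_IN, 1), extra_string(EARTH_RADIUS_IN, 1))

# Paint: the derivative is linear, so doubling the radius roughly doubles the extra paint.
print(extra_paint(10, 1) / extra_paint(5, 1))
```

The paint ratio comes out a little under 2 because a full inch is a chunky step; with a thinner coat the ratio moves closer to exactly double.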
Questions
Let's think about the derivative of $x^3$, a growing cube.
1) What dimension should the derivative of $x^3$ have?
2) How many viewpoints should $x^3 = x \cdot x \cdot x$ involve?
3) Have a guess for the derivative? Does it match with how you'd imagine
a cube to grow?
CHAPTER 14
THE FANCY ARITHMETIC OF CALCULUS

Here are the rules we have so far:
$$(f + g)' = f' + g'$$
$$(f \cdot g)' = f \cdot g' + g \cdot f'$$
$$\left(\frac{1}{x}\right)' = -\frac{1}{x^2}$$
Let's add a few more to our collection.
Power Rule
We've worked out that $\frac{d}{dx} x^2 = 2x$:
We can visualize the change, and ignore the artificial corner piece. Now,
how about visualizing $x^3$?
The process is similar. We can glue a plate to each side to expand the cube.
The "missing gutters" represent artifacts, where our new plates would interact
with each other.
I have to keep reminding myself: the gutters aren't real! They represent
growth that doesn't happen at this step. After our growth, we "melt" the cube
into its new, total area, and grow again. Counting the gutters would overestimate the growth that happened in this step. (Now, if we're forced to take
integer-sized steps, then the gutters are needed, but with infinitely-divisible
decimals, we can change smoothly.)
From the diagram, we might guess:
$$\frac{d}{dx} x^3 = 3x^2$$
And that's right! But we had to visualize the result. Abstractions like algebra let us handle scenarios we can't visualize, like a 10-dimensional shape.
Geometric shapes are a nice, visual starting point, but we need to move beyond
them.
We might begin analyzing a cube using algebra like this:
$$(x + dx)^3 = (x + dx)(x + dx)(x + dx) = (x^2 + 2x \cdot dx + (dx)^2)(x + dx) = \ldots$$
Yikes. The number of terms is getting scary, fast. What if we wanted the
10th power? Sure, there are algebra shortcuts, but let's think about the problem holistically.
Our cube $x^3 = x \cdot x \cdot x$ has 3 components: the sides. Call them a, b and c
to keep 'em straight. Intuitively, we know the total change has a contribution
from each side:
Now what change does each side think it's contributing?
a thinks: "My change ($da$) is combined with the other, unmoving, sides
($b \cdot c$) to get $da \cdot b \cdot c$"
b thinks: "My change ($db$) is combined with the other sides to get $db \cdot a \cdot c$"
c thinks: "My change ($dc$) is combined with the other sides to get $dc \cdot a \cdot b$"
Each change happens separately, and there's no "crosstalk" between $da$, $db$
and $dc$ (such crosstalk leads to gutters, which we want to ignore). The total
change is:
$$\text{A's changes} + \text{B's changes} + \text{C's changes} = (da \cdot b \cdot c) + (db \cdot a \cdot c) + (dc \cdot a \cdot b)$$
Let's write this in terms of $x$, the original side. Every side is identical
($a = b = c = x$) and the changes are the same ($da = db = dc = dx$), so we get:
$$(dx \cdot x \cdot x) + (dx \cdot x \cdot x) + (dx \cdot x \cdot x) = x^2 \cdot dx + x^2 \cdot dx + x^2 \cdot dx = 3x^2 \cdot dx$$
Converting this to a "per dx" rate we have:
$$\frac{d}{dx} x^3 = 3x^2$$
Neat! Now, the brain-dead memorization strategy is to think "Pull down
the exponent and decrease it by one". That isn't learning!
Think like this:
$x^3$ has 3 identical perspectives.
When the system changes, all 3 perspectives contribute identically. Therefore, the derivative will be $3 \cdot \text{something}$.
The "something" is the change in one side ($dx$) multiplied by the remaining sides ($x \cdot x$). The changing side goes from $x$ to $dx$ and the exponent
lowers by one.
We can reason through the rule! For example, what's the derivative of $x^5$?
Well, it's 5 identical perspectives ($5 \cdot \text{something}$). Each perspective is me
changing ($dx$) and the 4 other guys staying the same ($x \cdot x \cdot x \cdot x = x^4$). So the
combined perspective is just $5x^4$.
The general Power Rule:
$$\frac{d}{dx} x^n = n x^{n-1}$$
Now we can memorize the shortcut "bring down the exponent and subtract", just like we know that putting a 0 after a number multiplies by 10.
Shortcuts are fine once you know why they work!
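The "n identical perspectives" idea can be tested against a direct measurement. The following Python sketch is only an illustration (the function names are assumptions, not the author's). It compares the change each side claims with the actual change in x^n, and the gap between them is the gutters.

```python
def perspectives_change(x, n, dx):
    # Each of the n sides contributes its own dx times the other n-1 unchanged sides.
    return sum(dx * x ** (n - 1) for _ in range(n))

def actual_change(x, n, dx):
    # The real change in x^n, gutters and all.
    return (x + dx) ** n - x ** n

# A whole step on a cube: the gap between the two numbers is the gutters.
print(perspectives_change(2, 3, 1), actual_change(2, 3, 1))

# Tiny steps: the per-dx rate settles toward n * x^(n-1).
for n in [2, 3, 5]:
    dx = 1e-6
    print(n, actual_change(2, n, dx) / dx, n * 2 ** (n - 1))
```

For the whole-step cube, the perspectives account for 12 of the 19 units of growth; the remaining 7 are gutter pieces, which vanish from the rate as dx shrinks.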
Integrals of Powers
Let's try integrating a power, reverse engineering a set of changes into the
original pattern.
Imagine a construction site. Day 1, they order three 1×1 wooden planks.
The next day, they order three 2×2 wooden planks. Then three 3×3 planks.
Then three 4×4 planks. What are they building?
My guess is a cube. They are building a shell, layer by layer, and perhaps
putting grout between the gutters to glue them together.
Similarly, if we see a series of changes like $3x^2$, we can visualize the plates
being assembled to build a cube:
$$\int 3x^2 = x^3$$
Ok, we took the previous result and worked backward. But what about
the integral of plain old $x^2$? Well, just imagine that incoming change is being
split 3 ways:
$$x^2 = \frac{x^2}{3} + \frac{x^2}{3} + \frac{x^2}{3} = \frac{1}{3} 3x^2$$
Ah! Now we have 3 plates (each 1/3 of the original size) and we can
integrate a smaller cube. Imagine the "incoming material" being split into 3
piles to build up the sides:
$$\int x^2 = \int \frac{1}{3} 3x^2 = \frac{1}{3} \int 3x^2 = \frac{1}{3} x^3$$
If we have 3 piles of size $x^2$, we can make a full-sized cube. Otherwise, we
build a mini-cube, 1/3 as large.
The general integration rule is:
$$\int x^n = \frac{1}{n+1} x^{n+1}$$
After some practice, you'll do the division automatically. But now you know
why it's needed: we have to split the incoming "change material" among several sides. (Building a square? Share changes among 2 sides. Building a cube?
Share among 3 sides. Building a 4d hypercube? Call me.)
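One way to check the integration rule is to literally pile up the thin slices and compare with the formula. This Python sketch is an assumption-laden illustration, not part of the lesson: it accumulates from 0 to an upper value using a midpoint sum with a made-up number of slices, and the function names are hypothetical.

```python
def accumulate_power(n, upper, slices=100_000):
    # Add up slices of height x^n and width dx, from 0 to upper.
    dx = upper / slices
    return sum(((i + 0.5) * dx) ** n * dx for i in range(slices))

def power_integral(n, upper):
    # The general rule: x^(n+1) / (n+1), measured from 0 to upper.
    return upper ** (n + 1) / (n + 1)

for n in [1, 2, 3]:
    print(n, accumulate_power(n, 3.0), power_integral(n, 3.0))
```

The slice-by-slice total and the shortcut agree to many decimal places, and the division by n+1 is exactly the "split the material among the sides" step.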
The Quotient Rule

We've seen the derivative of an inverse (a "simple division"):
$$\frac{d}{dx} \frac{1}{x} = -\frac{1}{x^2}$$
Remember the cake metaphor? We cut our existing portion ($\frac{1}{x}$) into $x$ slices,
and give one away.
Now, how can we find the derivative of $\frac{f}{g}$? One component in the system is
trying to grow us, while the other divides us up. Which wins?
Abstraction to the rescue. When finding the derivative of $x^3$, we imagined
it as $x^3 = a \cdot b \cdot c$, which helped simplify the interactions. Instead of a mishmash
of x's being multiplied, it was just 3 distinct perspectives to consider.
Similarly, we can rewrite $\frac{f}{g}$ as two perspectives:
$$\frac{f}{g} = a \cdot b$$
We know $a = f$ and $b = \frac{1}{g}$. From this zoomed-out view, it looks like a
normal, rectangular, product-rule scenario:
$$(a \cdot b)' = da \cdot b + db \cdot a$$
It's our little secret that $b$ is really $\frac{1}{g}$, which behaves like a division. We just
want to think about the big picture of how the rectangle changes.
Now, since $a$ is just a rename of $f$, we can swap in $da = df$. But how do we
swap out $b$? Well, we have:
$$b = \frac{1}{g}$$
$$\frac{db}{dg} = -\frac{1}{g^2}$$
$$db = -\frac{1}{g^2}\,dg$$
Ah! This is our cake cutting. As $g$ grows, we lose $db = -\frac{1}{g^2} \cdot dg$ from the $b$
side. The total impact is:
$$(a \cdot b)' = (da \cdot b) + (db \cdot a) = \left(df \cdot \frac{1}{g}\right) + \left(\frac{-1}{g^2}\,dg \cdot f\right)$$
This formula started with the product rule, and we plugged in the real
values. Might as well put $f$ and $g$ back into $(a \cdot b)'$, to get the Quotient Rule
(aka the Division Rule):
$$\left(\frac{f}{g}\right)' = \left(df \cdot \frac{1}{g}\right) + \left(\frac{-1}{g^2}\,dg \cdot f\right)$$
Many textbooks re-arrange this relationship, like so:
$$\left(df \cdot \frac{1}{g}\right) + \left(\frac{-1}{g^2}\,dg \cdot f\right) = \frac{g \cdot df}{g^2} - \frac{f \cdot dg}{g^2} = \frac{g \cdot df - f \cdot dg}{g^2}$$
And I don't like it, no ma'am, not one bit! This version no longer resembles
its ancestor, the product rule.
In practice, the Quotient Rule is a torture device designed to test your memorization skills; I rarely remember it. Just think of $\frac{f}{g}$ as $f \cdot \frac{1}{g}$, and use the
product rule like we've done.
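To confirm that the product-rule form and the textbook form are the same thing, here is a short Python sketch. The example functions f(x) = x^2 and g(x) = x + 1 are hypothetical choices made for this illustration, as are the function names; the numeric derivative is only an approximation.

```python
def f(x):
    return x ** 2      # hypothetical "grower"

def df(x):
    return 2 * x       # its derivative, from the power rule

def g(x):
    return x + 1       # hypothetical "divider"

def dg(x):
    return 1.0         # its derivative

def product_form(x):
    # (df * 1/g) + (-1/g^2 * dg * f): product rule with b = 1/g.
    return df(x) * (1 / g(x)) + (-1 / g(x) ** 2) * dg(x) * f(x)

def textbook_form(x):
    # (g * df - f * dg) / g^2: the re-arranged version.
    return (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2

def numeric_form(x, h=1e-6):
    # Measure the change in f/g directly with a small central step.
    return (f(x + h) / g(x + h) - f(x - h) / g(x - h)) / (2 * h)

for x in [0.5, 2.0, 10.0]:
    print(x, product_form(x), textbook_form(x), numeric_form(x))
```

All three columns should line up, so there is nothing extra to memorize: the quotient rule is the product rule wearing a disguise.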
Questions
Let's do a few warm-ups to test our skills. Can you solve these bad boys?
$$\frac{d}{dx} x^4 = ?$$
$$\frac{d}{dx} 3x^5 = ?$$
(You can check your answers with Wolfram Alpha, such as d/dx x^4.)
Again, don't get lost in the symbols. Think "I have $x^4$, what pattern of
changes will I see as I make $x$ larger?".
Ok! How about working backwards, and doing some integrals?
$$\int 2x^2 = ?$$
$$\int x^3 = ?$$
Ask yourself, "What original pattern would create steps in the pattern $2x^2$?"
Trial-and-error is ok! Try a formula, test it, and adjust it. Personally, I like
to move aside the 2 and just worry about the integral of $x^2$:
$$\int 2x^2 = 2 \int x^2 = ?$$
How do you know if you're right? Take the derivative: you are the antiques
dealer! I brought you a pattern of shards ($2x^2$) and you need to tell me the vase
they came from. Once you have your guess, just break it in the back room, and
make sure you get $2x^2$ back out. Then you'll be confident in your answer.
We're getting ready to work through the circle equations ourselves, and
recreate results found by Archimedes, likely the greatest mathematician of all
time.
CHAPTER 15
DISCOVERING ARCHIMEDES' FORMULAS

In the preceding lessons we uncovered a few calculus relationships, the arithmetic of how systems change:
How do these rules help us?

If we have an existing equation, the rules are a shortcut to find the step-by-step pattern. Instead of visualizing a growing square, or cube, the
Power Rule lets us crank through the derivatives of $x^2$ and $x^3$. Whether
$x^2$ is from a literal square or multiplied elements isn't important: we'll
get the pattern of changes.
If we have a set of changes, the rules help us reverse-engineer the original
pattern. Getting changes like $2x$ or $14x$ is a hint that $\text{something} \cdot x^2$ was
the original pattern.
Learning to "think with calculus" means we can use X-Ray and Time-Lapse
vision to imagine changes taking place, and use the rules to work out the
specifics. Eventually, we might not visualize anything, and just work with the
symbols directly (as you likely do with arithmetic today).
In the start of the course, we morphed a ring into a circle, then a sphere,
then a shell:
With the official rules in hand, we can blast through the calculations and
find the circle/sphere formulas on our own. It may sound strange, but the
formulas feel different to me, almost alive, when you see them morphing in
front of you. Let's jump in.
Changing Circumference To Area

Our first example of step-by-step thinking was gluing a sequence of rings to
make a circle:
When we started, we needed a lot of visualization. We had to unroll the
rings, line them up, realize they made a triangle, then use $\frac{1}{2} \cdot \text{base} \cdot \text{height}$ to get
the area. Visual, tedious... and necessary. We need to feel what's happening
before working with raw equations.
Here's the symbolic approach:
Let's walk through it. The notion of a "ring-by-ring timelapse" sharpens into
"integrate the rings, from nothing to the full radius" and ultimately:
$$\text{area} = \int_0^r 2\pi r \, dr$$
Each ring has height $2\pi r$ and width $dr$, and we want to accumulate that
area to make our disc.
How can we solve this equation? By working backwards. We can move the
$2\pi$ part outside the integral (remember the scaling property?) and focus on the
integral of $r$:
$$2\pi \int_0^r r \, dr = ?$$
What pattern makes steps of size $r$? Well, we know that $r^2$ creates steps of
size $2r$, which is double what we need. Half that should be perfect. Let's try it
out:
$$\frac{d}{dr} \frac{1}{2} r^2 = \frac{1}{2} \frac{d}{dr} r^2 = \frac{1}{2} 2r = r$$
Yep, $\frac{1}{2} r^2$ gives us the steps we need! Now we can plug in the solution to the
integral:
$$\text{area} = 2\pi \int_0^r r \, dr = 2\pi \cdot \frac{1}{2} r^2 = \pi r^2$$
This is the same result as making the ring-triangle in the first lesson, but
we manipulated equations, not diagrams. Not bad! It'll help even more once
we get to 3d...
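The ring-by-ring timelapse can also be run as a loop. The Python sketch below is an illustration only (the function name and the number of rings are assumptions). It glues together many thin rings, each with area "circumference times dr", and compares the total with pi * r^2.

```python
import math

def disc_area_from_rings(radius, rings=100_000):
    # Accumulate thin rings from the center out to the full radius.
    dr = radius / rings
    total = 0.0
    for i in range(rings):
        r = (i + 0.5) * dr              # radius at the middle of this ring
        total += 2 * math.pi * r * dr   # ring area: height 2*pi*r, width dr
    return total

radius = 3.0
print(disc_area_from_rings(radius), math.pi * radius ** 2)
```

The loop is the tedious visual approach; the integral gives the same number with one line of symbol-pushing.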
Changing Area To Volume

Let's get fancier. We can take our discs, thicken them into plates, and build a
sphere:
Let's walk slowly. We have several plates, each at a different x-coordinate.
What's the size of a single plate?
The plate has a thickness ($dx$), and its own radius. The *radius of the plate*
is some distance off the x-axis, which we can call $y$.
It's a little confusing at first: $r$ is the radius of the entire sphere, but $y$ is the
(usually smaller) radius of an individual plate under examination. In fact, only
the center plate ($x = 0$) will have its radius the same as the entire sphere. The
"end plates" don't have a height at all.
And by the Pythagorean theorem, we have a connection between the x-position of the plate, and its height ($y$):
$$x^2 + y^2 = r^2$$
Ok. We have the size of each plate, and can integrate to find the volume, right?
Not so fast. Instead of starting on the left side, with a negative x-coordinate,
moving to 0, and then up to the max, let's just think about a sphere as two
halves:
To find the total volume, get the volume of one half, and double it. This is
a common trick: if a shape is symmetrical, get the size of one part and scale
it up. Often, it's easier to work out "0 to max" than "min to max", especially
when "min" is negative.
Ok. Now let's solve it:
Whoa! Quite an equation, there. It seems like a lot, but we'll work through
it:
First off: 3 variables is too many to have flying around. We'll write the
height of each plate ($y$), in terms of the others:
$$\text{height} = y = \sqrt{r^2 - x^2}$$
The square root looks intimidating at first, but it's being plugged into $y^2$
and the exponent will cancel it out. After plugging in $y$, we have the much
nicer:
$$\text{Volume} = 2\pi \int_0^r r^2 - x^2 \, dx$$
The parentheses are often dropped because it's "known" that $dx$ is multiplied by the entire size of the step. We know the step is $(r^2 - x^2) \cdot dx$ and not
$r^2 - (x^2 \cdot dx)$.
Let's talk about $r$ and $x$ for a minute. $r$ is the radius of the entire sphere,
such as "15 inches". You can imagine asking "I want the volume of a sphere
with a radius of 15 inches". Fine.
To figure this out, we'll create plates at each x-coordinate, from $x = 0$ up to
$x = 15$ (and double it). $x$ is the bookkeeping entry that remembers which plate
we're on. We could work out the volume from $x = 0$ to $x = 7.5$, let's say, and
we'd build a partial sphere (maybe useful, maybe not). But we want the whole
shebang, so we let $x$ go from 0 to the full $r$.
Time to solve this bad boy. What equation has steps like $r^2 - x^2$?
First, let's use the addition rule: steps like $a - b$ are made from two patterns
(one making $a$, the other making $b$).
Let's look at the first pattern, the steps of size $r^2$. We're moving along the
x-axis, and $r$ is a number that never changes: it's 15 inches, the size of our
sphere. This max radius never depends on $x$, the position of the current plate.
When a scaling factor doesn't change during the integral ($r$, $\pi$, etc.), it can
be moved outside and scaled up at the end. So we get:
$$\int r^2 \, dx = r^2 \int dx = r^2 x$$
In other words, $r^2 x$ is a linear trajectory that contributes a constant $r^2$ at
each step.
Cool. How about the integral of $-x^2$? First, we can move out the negative
sign and take the integral of $x^2$:
$$-\int x^2 \, dx = ?$$
We've seen this before. Since $x^3$ has steps of $3x^2$, taking 1/3 of that amount
($\frac{x^3}{3}$) should be just right. And we can check that our integral is correct:
$$\frac{d}{dx} \frac{1}{3} x^3 = \frac{1}{3} \frac{d}{dx} x^3 = \frac{1}{3} 3x^2 = x^2$$
It works out! Over time, you'll learn to trust the integrals you reverse-engineer, but when starting out, it's good to check the derivative. With the
integrals solved, we plug them in:
$$2\pi \int r^2 - x^2 \, dx = 2\pi\left(r^2 x - \frac{1}{3} x^3\right)$$
What's left? Well, our formula still has $x$ inside, which measures the volume
from 0 to some final value of $x$. In this case, we want the full radius, so we set
$x = r$:
$$2\pi\left(r^2 x - \frac{1}{3} x^3\right) \xrightarrow{\text{set } x = r} 2\pi\left((r^2) r - \frac{1}{3} r^3\right) = 2\pi\left(r^3 - \frac{1}{3} r^3\right) = 2\pi \cdot \frac{2}{3} r^3 = \frac{4}{3}\pi r^3$$
Tada! You've found the volume of a sphere (or another portion of a sphere,
if you use a different range for $x$).
Think that was hard work? You have no idea. That one-line computation
took Archimedes, one of the greatest geniuses of all time, tremendous effort
to figure out. He had to imagine some spheres, and a cylinder, and some
cones, and a fulcrum, and imagine them balancing and... let's just say when
he found the formula, he had it written on his grave. Your current calculus
intuition would have saved him incredible effort (see this video).
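The plate-stacking argument can be checked with a loop as well. This Python sketch is an illustration under stated assumptions: the function name and plate count are made up, and the 15-inch radius is borrowed from the example in the text. It stacks thin plates from x = 0 to x = r, doubles the half-sphere, and compares with (4/3) * pi * r^3.

```python
import math

def sphere_volume_from_plates(radius, plates=100_000):
    # Stack thin plates along the x-axis for one half, then double it.
    dx = radius / plates
    half = 0.0
    for i in range(plates):
        x = (i + 0.5) * dx                  # position of this plate
        y_squared = radius ** 2 - x ** 2    # Pythagoras: y^2 = r^2 - x^2
        half += math.pi * y_squared * dx    # plate volume: disc area times thickness
    return 2 * half

r = 15.0
print(sphere_volume_from_plates(r), 4 / 3 * math.pi * r ** 3)
```

Notice that the square root never appears in the loop: each plate needs y^2, not y, which is the same cancellation the algebra relies on.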
Changing Volume To Surface Area

Now that we have volume, finding surface area is much easier. We can take a
thin peel of the sphere with a shell-by-shell X-Ray:
I imagine the entire shell as powder on the surface of the existing sphere.
How much powder is there? It's $dV$, the change in volume. Ok, what is the
area the powder covers?
Hrm. Think of a similar question: how much area will a bag of mulch cover?
Get the volume, divide by the desired thickness, and you have the area covered.
If I give you 300 cubic inches of dirt, and you spread it in a pile 2 inches thick, the
pile will cover 150 square inches. After all, if $\text{area} \cdot \text{thickness} = \text{volume}$ then
$\text{area} = \frac{\text{volume}}{\text{thickness}}$.
In our case, $dV$ is the volume of the shell, and $dr$ is its thickness. We can
spread $dV$ along the thickness we're considering ($dr$) and see how much area
we added: $\frac{dV}{dr}$, the derivative.
This is where the right notation comes in handy. We can think of the derivative as an abstract, instantaneous rate of change ($V'$), or as a specific ratio
($\frac{dV}{dr}$). In this case, we want to consider the individual elements, and how they
interact (volume of shell / thickness of shell).
So, given the relation,
$$\text{area of shell} = \frac{\text{volume of shell}}{\text{depth of shell}} = \frac{dV}{dr}$$
we figure out:
$$\frac{d}{dr} \text{Volume} = \frac{d}{dr} \frac{4}{3}\pi r^3 = \frac{4}{3}\pi \frac{d}{dr} r^3 = \frac{4}{3}\pi (3r^2) = 4\pi r^2$$
Wow, that was fast! The order of our morph (circumference / area / volume
/ surface area) made the last step simple. We could try to spin a circumference
into surface area directly, but it's more complex.
As we cranked through this formula, we "dropped the exponent" on $r^3$ to
get $3r^2$. Remember this change comes from 3 perspectives (dimensions) that
contributed an equal share.
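The mulch argument (volume of the shell divided by its thickness) translates directly into code. The Python sketch below is illustrative, with made-up function names and an arbitrary 15-inch radius. It spreads a thin shell of volume dV over thickness dr and compares the result with 4 * pi * r^2.

```python
import math

def volume(r):
    # Volume of a sphere, as found in the previous section.
    return 4 / 3 * math.pi * r ** 3

def shell_area(r, dr):
    # Volume of the thin shell divided by its thickness: dV / dr.
    return (volume(r + dr) - volume(r)) / dr

r = 15.0
for dr in [1.0, 0.01, 0.000001]:
    print(dr, shell_area(r, dr), 4 * math.pi * r ** 2)
```

A thick shell overestimates the area (its outer layers sit on a bigger sphere), and the estimate tightens as the shell gets thinner.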
2000 Years Of Math In An Afternoon

The steps we worked through took 2000 years of thought to discover, by the
greatest geniuses no less. Calculus is such a broad and breathtaking viewpoint
that its difficult to imagine where it doesnt apply. Its just about using X-Ray
and Time-Lapse vision:
74
Break things down. In your current situation, whats the next thing that
will happen? And after that? Is there a pattern here? (Getting bigger,
smaller, staying the same.) Is that knowledge useful to you?
Find the source. Youre seeing a bunch of changes what caused them?
If you know the source, can you predict the end-result of all the changes?
Is that prediction helpful?
Were used to analyzing equations, but I hope it doesnt stop there. Numbers
can describe mood, spiciness, and customer satisfaction; step-by-step thinking
can describe battle plans and psychological treatment. Equations and geometry
are just nice starting points to analyze. Math isnt about equations, and music
isnt about sheet music they point to the idea inside the notation.
While there are more details for other derivatives, integration techniques,
and how infinity works, you don't need them to start thinking with Calculus.
What you discovered today would have made Archimedes tear up, and that's a
good enough start for me.
Happy math.
