Differentiation from first principles - x²

By Martin McBride, 2023-04-07
Tags: x squared polynomial first principles derivative
Categories: differentiation calculus


When we differentiate a function f(x) we obtain its derivative f'(x). The derivative is a function that tells us the slope of the curve for any value of x.

In this article we will see how to differentiate a function from first principles. This is a general technique that can be used to find the derivative of many different functions.

We will illustrate the technique for the specific case of x squared.

We will also derive the same result based on a geometric interpretation of the square function.

Differentiation from first principles

Here is a function f(x):

Function graph

The slope of the curve at a particular point P is given by the slope of the tangent to the curve at that point. The tangent is a line that just touches the curve at P without crossing it there.

Finding the approximate tangent

We can find the approximate slope of the tangent at point P by creating a second point Q, a small distance h further along the curve:

Approximate slope

The line PQ has a slope that is approximately equal to the slope of the curve at P.

Point P has an x-value of x, so its y-value is f(x):

Point P

Point Q has an x-value of x + h, where h is some small value. Its y-value is f(x+h):

Point Q

The slope of the line is given by:

$$\text{slope} = \frac{\Delta y}{\Delta x}$$

Where Δx, the change in x-values between P and Q, is:

$$\Delta x = (x + h) - x = h$$

And Δy, the change in y-values between P and Q, is:

$$\Delta y = f(x + h) - f(x)$$

So the slope of PQ is:

$$\text{slope} = \frac{\Delta y}{\Delta x} = \frac{f(x + h) - f(x)}{h}$$
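As a minimal sketch, the slope of PQ can be computed directly from the two changes described above. The function name `secant_slope` and the sample values (the curve x squared, x = 1, h = 0.5) are illustrative assumptions, not something taken from the article.

```python
def secant_slope(f, x, h):
    """Slope of the line PQ, where P is at x and Q is at x + h."""
    delta_x = (x + h) - x        # change in x-values between P and Q
    delta_y = f(x + h) - f(x)    # change in y-values between P and Q
    return delta_y / delta_x


# Illustrative values only: the curve y = x**2, P at x = 1, Q at x = 1.5
example_slope = secant_slope(lambda x: x**2, 1.0, 0.5)
print(example_slope)
```

This only gives the slope of the straight line through P and Q, so it depends on the choice of h as well as on x.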

Finding the exact tangent

The calculation above is only an approximation of the slope. The problem is that it measures the gradient of the line between P and Q. In fact, P and Q have been deliberately placed quite far apart to make it clear that the slope is not accurate.

But what we really want to know is the gradient of the tangent at the point P.

One thing we can do is move point Q closer to point P. This brings the slope of PQ closer to the slope of the curve at P:

Slope formula

The x-distance between P and Q is equal to h, so the smaller we make h, the closer the points become and the more accurate the slope.

But we can't simply set h equal to zero. If we did that, P and Q would be the same point. Δx and Δy would both be zero, so the slope would be zero divided by zero, which is undefined - it could be any value. So setting h to zero tells us nothing about the slope.

What we can do is evaluate the slope as h gets closer and closer to zero. This is called a limit. As h gets closer to zero, the ratio of Δy to Δx often approaches a limiting value. We call this limit dy/dx (pronounced "dee y by dee x"):

$$\frac{dy}{dx} = \lim_{h \to 0} \frac{\Delta y}{\Delta x}$$

This notation tells us that dy/dx is equal to the limit of Δy over Δx as h tends to zero. This is equal to the slope of the tangent at x, so dy/dx is the derivative of f(x).

If we substitute the previous values for Δy and Δx we get:

$$\frac{dy}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

This is the derivative of f(x) from first principles.
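The limiting behaviour can be explored numerically. The sketch below is only an illustration: the helper `secant_slope`, the example function x cubed and the point x = 1 are assumptions chosen for the demonstration. It prints the slope of PQ for smaller and smaller h, then shows that h = 0 itself cannot be used.

```python
def secant_slope(f, x, h):
    """Slope of the line from (x, f(x)) to (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def f(x):
    # Hypothetical example function, chosen only for illustration
    return x**3


x = 1.0
slopes = []
for k in range(1, 7):
    h = 10.0 ** -k                      # h = 0.1, 0.01, ..., 0.000001
    slopes.append(secant_slope(f, x, h))
    print(f"h = {h:<8g} slope of PQ = {slopes[-1]:.6f}")

# h = 0 cannot be used directly: the ratio becomes zero divided by zero
try:
    secant_slope(f, x, 0.0)
except ZeroDivisionError:
    print("h = 0 is undefined (division by zero)")
```

For this example the printed slopes settle towards 3 as h shrinks, which is the value the limit picks out.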

We can also write this using prime notation, where we use f' to represent the derivative of f. So this equation means exactly the same thing as the previous one:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

Now this formula doesn't tell us anything specific on its own, because we haven't yet specified what the function f(x) is. We will use the example of the x squared function, and use the formula to find the slope of that curve.

Differentiating x squared from first principles

To differentiate x squared from first principles, we use the formula from before:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

We then substitute x squared for f(x):

$$f'(x) = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h}$$

Multiplying out (x + h) squared gives:

$$f'(x) = \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h}$$

The terms in x squared cancel out:

$$f'(x) = \lim_{h \to 0} \frac{2xh + h^2}{h}$$

We can then cancel out a factor of h on the top and bottom:

$$f'(x) = \lim_{h \to 0} (2x + h)$$

The limit is then quite simple. As h tends to zero, the h term just disappears, giving:

$$f'(x) = 2x$$

So at any point on the x squared curve, the slope is just 2x.
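The algebra can be checked with exact arithmetic. The following sketch uses Python's `fractions.Fraction` so that no rounding occurs. The names `square` and `secant_slope`, and the choice x = 3, are illustrative assumptions. For each h it confirms that the slope of PQ is exactly 2x + h, which is the expression left after cancelling the factor of h.

```python
from fractions import Fraction


def square(x):
    return x * x


def secant_slope(f, x, h):
    """Slope of the line from (x, f(x)) to (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


x = Fraction(3)  # illustrative x-value
for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 100000)):
    slope = secant_slope(square, x, h)
    # After cancelling h on the top and bottom, the slope is exactly 2x + h
    assert slope == 2 * x + h
    print(f"h = {h}: slope of PQ = {float(slope)}")
```

As h shrinks, the extra h term shrinks with it, leaving 2x.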

Verifying the result graphically

Here is a table showing the slope of the curve for various values of x, using the formula 2x for the slope:

x     f'(x) = 2x
-2    -4
-1    -2
0     0
1     2
2     4

Here is a plot of x squared with tangent lines at x-positions -2 to +2, with the slopes calculated in the table. The slopes appear to match the slope of the curve:

x squared derivative graph
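The tangent lines in such a plot can be worked out from the slopes in the table. This is a sketch under stated assumptions: the helper `tangent_line` is a hypothetical name, and it uses the standard fact that a line with slope m through the point (a, a²) has intercept a² - ma.

```python
def tangent_line(a):
    """Return (slope, intercept) of the tangent to y = x**2 at x = a."""
    slope = 2 * a                   # the derivative 2x evaluated at x = a
    intercept = a**2 - slope * a    # the line must pass through (a, a**2)
    return slope, intercept


for a in range(-2, 3):
    slope, intercept = tangent_line(a)
    print(f"x = {a:2d}: slope = {slope:2d}, tangent is y = {slope}x + ({intercept})")
```

Each tangent touches the curve at (a, a²) and has the slope listed in the table for that x-value.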

Geometric interpretation

Finally, we will look at a simple geometric interpretation of differentiating x squared. The square on the left has sides of length x so its area, of course, is x squared:

x squared derivative geometry

The square on the right shows what happens if we increase the side length of the square by a tiny amount h. This increases the total area of the square:

  • It adds two rectangles to the square (shown in orange), each of size x by h. The total increase in area due to both of these rectangles is 2xh.
  • It also adds a small square (shown in yellow) of side h. This adds an extra area h squared.

So the change in area, Δarea, of the square after increasing each side x by a small amount h is:

$$\Delta\text{area} = 2xh + h^2$$

This looks quite similar to the earlier formula. Now let's see what happens as we make h smaller:

x squared derivative geometry

The two orange rectangles get smaller, but the tiny yellow square gets much smaller, much more quickly. As h gets extremely small, the yellow square becomes so small we can ignore it altogether. This removes the term in h squared:

$$\Delta\text{area} \approx 2xh$$

So if we look at the rate of change of the area, which is Δarea divided by h, we get:

$$\frac{\Delta\text{area}}{h} \approx 2x$$

This is the same result we found previously. It is a different way of looking at the same problem, which hopefully provides an intuitive explanation of why we can ignore the term in h squared.
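The geometric argument can also be checked numerically. In this sketch the function name `area_change` and the side length x = 5 are illustrative assumptions. It splits the change in area into the two rectangles and the small square, then shows how the small square's share shrinks as h gets smaller.

```python
from fractions import Fraction


def area_change(x, h):
    """Return the two parts of the change in area when the side grows from x to x + h."""
    rectangles = 2 * x * h    # the two x-by-h rectangles
    small_square = h**2       # the small h-by-h square
    return rectangles, small_square


x = Fraction(5)  # hypothetical side length, for illustration only
for h in (Fraction(1), Fraction(1, 10), Fraction(1, 100)):
    rectangles, small_square = area_change(x, h)
    delta_area = (x + h) ** 2 - x**2
    # The two parts together account for the whole change in area
    assert delta_area == rectangles + small_square
    print(f"h = {h}: change in area / h = {float(delta_area / h)}, "
          f"small square / rectangles = {float(small_square / rectangles)}")
```

The ratio of the small square to the rectangles is h/(2x), so it shrinks in proportion to h, while the change in area divided by h approaches 2x.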
