This notebook describes how to write a program that will manipulate mathematical expressions, with an emphasis on differentiation. The program can compute $\frac{d}{dx} x^2 + \sin(x)$ to get the result $2x + \cos(x)$. Usually in a programming language, when we say sin(x)
we are computing with numbers, not symbolic expressions. That is, programming languages have built-in functions like sin
and numbers like 0 so that we can ask for the value of sin(0)
and get back 0. But now we want to do something different: treat $\sin(x)$ as a symbolic expression that represents any value of $x$. That facility is not built-in to most programming languages, so we need a way to represent and manipulate these symbolic expressions. Python has a large system called sympy
to do just that, but we will develop a smaller, simpler system from scratch.
Here is one simple way to represent symbolic expressions: We will represent numbers as themselves, because they are already built-in to Python. We will represent math variable (such as $x$) and constants (such as $\pi$) as Python strings (such as 'x'
and 'pi'
). And we will represent compound expressions (such as $x + 1$) as Python tuples, such as ('x', '+', 1)
.
We will use the notion of arity of an expression. An expression can be:
We can define this as a Python function:
def arity(expr):
"The number of sub-expressions in this expression."
if isinstance(expr, tuple):
return len(expr) - 1
else:
return 0
For example:
exp = ('x', '+', 1)
arity(exp)
2
This table summarizes the kinds of expressions and how we will represent them:
Math Expression | Math Type | Arity | Python Representation | Python Type | |||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
$3$ | Integer | 0 | 3int
| $3.14$ | Number | 0 | 3.14 | float
| $x$ | Variable | 0 | 'x' | str
| $\pi$ | Constant | 0 | 'pi' | str
| $-x$ | Unary op | 1 | ('-', 'x') | tuple
| $\sin(x)$ | Unary func | 1 | ('sin', 'x') | tuple
| $x+1$ | Binary op | 2 | ('x', '+', 1) | tuple
| $3x - y/2$ | Nested | Binary op 2 | ((3, '*', 'x'), '-', | ('y', '/', 2)) tuple
| |
In Calculus, differentiation is the process of computing the derivative of a function: the rate of change of the function with respect to a variable. We denote the dirivative of $y$ with respect to the variable $x$ as $\frac{dy}{dx}$.
We can compute the derrivative by consulting a table of differentiation rules. Here are the rules for variables, constants, unary negation, and the four basic binary arithmetic operations:
$\frac{d}{dx} (x) = 1$
$\frac{d}{dx} (c) = 0$
$\frac{d}{dx} (-u) = - \frac{du}{dx}$
$\frac{d}{dx} (u + v) = \frac{du}{dx} + \frac{dv}{dx}$
$\frac{d}{dx} (u - v) = \frac{du}{dx} - \frac{dv}{dx}$
$\frac{d}{dx} (u \times v) = v \frac{du}{dx} + u \frac{dv}{dx}$
$\frac{d}{dx} (u \; / \; v) = (v \frac{du}{dx} - u \frac{dv}{dx}) \; / \; v^2$
D(y, x)
¶We will define the Python function D(y, x)
to compute the symbolic derivative of the expression y
with respect to the variable x
. We can handle each of the seven rules, one by one. If the input is of a form that does not match one of the rules, we will raise a ValueError
.
def D(y, x='x'):
"""Return the symbolic derivative, dy/dx.
Handles binary operators +, -, *, /, and unary -, over variables and constants."""
if y == x:
return 1 # d/dx (x) = 1
if arity(y) == 0:
return 0 # d/dx (c) = 0
if arity(y) == 1:
(op, u) = y # y is a compund expression of the form (op, u)
if op == '-': return ('-', D(u, x)) # d/dx (-u) = - du/dx
if arity(y) == 2:
(u, op, v) = y # y is a compound expression of the form (u, op, v)
if op == '+': return D(u, x), '+', D(v, x)
if op == '-': return D(u, x), '-', D(v, x)
if op == '*': return (v, '*', D(u, x)), '+', (u, '*', D(v, x))
if op == '/': return ((v, '*', D(u, x)), '-', (u, '*', D(v, x))), '/', (v, '*', v)
raise ValueError("D can't handle this: " + str(y))
Let's see what D
can do:
D(('x', '+', 1), 'x')
(1, '+', 0)
D(((3, '*', 'x'), '+', 'c'), 'x')
((('x', '*', 0), '+', (3, '*', 1)), '+', 0)
D(('y', '*', 'y'), 'y')
(('y', '*', 1), '+', ('y', '*', 1))
These answers are all mathematically correct, but they are not simplified—better answers would be 1
, 3
, and (2, '*', 'y')
, respectively. So we have solved the basic problem, but there are enhancements we could make in several directions:
1
instead of (1, '+', 0)
.I choose to refactor first (because that will make the other two tasks easier).
I will refactor the representation of expressions, replacing tuples with a user-defined class, Expression
. Why do that? Three reasons:
x + 1
rather than the more verbose (x, '+', 1)
.(x + 1)
form on output, rather than the uglier ('x', '+', 1)
.Here's what I have in mind:
Math Expression | Math Type | Arity | Python Representation | Python Type | |||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
$3$ | Integer | 0 | 3int
| $3.14$ | Number | 0 | 3.14 | float
| $x$ | Variable | 0 | Expression('x') | Expression
| $\pi$ | Constant | 0 | Expression('pi') | Expression
| $-x$ | Unary op | 1 | Expression('-', 'x') | Expression
| $\sin(x)$ | Unary func | 1 | Expression('sin', 'x') | Expression
| $x+1$ | Binary op | 2 | Expression('+', x, 1) (given def for `x`) | Expression
| $3x - y/2$ | Nested | Binary op 2 | 3 * x - y / 2 (given defs for `x`, `y`) | Expression
| |
Note that not every math expression is an instance of the new class Expression
; we still use good old Python int
and float
for numbers. But variables, constants (like $\pi$ or $c$), and unary and binary expressions are all instances of Expression
. Note also that I considered having separate subclasses of Expression
for each of these, but I decided the complication was not worth it.
To understand the implementation of Expression
, you need to understand Python's special method names, the method names that begin and end with double underscores, like __init__
and __add__
. If you're not familiar with the concept, glance at the documentation, but here is a quick summary:
__init__(self, ...)
The constructor that defines what happens when a new object is created.__add__(self, other)
The method called when the code asks for self + other
. Similar for __sub__
, etc.__radd__(self, other)
'r' for reverse. Called when the code asks for, say, 1 + self
, where 1
is not an Expression
. Similar for __rsub__
, etc.__neg__(self)
The method called when the code says (- self)
. Similar for __pos__
.__hash__(self)
Computes a hash value, an integer, used as a key into a hash table (dict
).__eq__(self, other)
Called to check whether self == other
.from __future__ import division
class Expression(object):
"A mathematical expression (other than a number)."
def __init__(self, op, *args): self.op, self.args = op, args
def __add__(self, other): return Expression('+', self, other)
def __sub__(self, other): return Expression('-', self, other)
def __mul__(self, other): return Expression('*', self, other)
def __truediv__(self, other): return Expression('/', self, other)
def __pow__(self, other): return Expression('**', self, other)
def __radd__(self, other): return Expression('+', other, self)
def __rsub__(self, other): return Expression('-', other, self)
def __rmul__(self, other): return Expression('*', other, self)
def __rtruediv__(self, other): return Expression('/', other, self)
def __rpow__(self, other): return Expression('**', other, self)
def __neg__(self): return Expression('-', self)
def __pos__(self): return self
def __hash__(self): return hash(self.op) ^ hash(self.args)
def __ne__(self, other): return not (self == other)
def __eq__(self, other): return (isinstance(other, Expression)
and self.op == other.op
and self.args == other.args)
def __repr__(self):
"A string representation of the Expression."
op, args = self.op, self.args
n = arity(self)
if n == 0:
return op
if n == 2:
return '({} {} {})'.format(args[0], op, args[1])
if n == 1:
arg = str(args[0])
if arg.startswith('(') or op in {'+', '-'}:
return '{}{}'.format(op, arg)
else:
return '{}({})'.format(op, arg)
def arity(expr):
"The number of sub-expressions in this expression."
if isinstance(expr, Expression):
return len(expr.args)
else:
return 0
Let's first define some symbols:
a, b, c, d, u, v, w, x, y, z, pi , e = map(Expression, [
'a', 'b', 'c', 'd', 'u', 'v', 'w', 'x', 'y', 'z', 'pi', 'e'])
So, for example, a
is now an object of type Expression
such that a.op
is the string 'a'
and a.args
is the empty tuple, and arity(a)
is 0. The Python expression a + 1
is equivalent to a.__add__(1)
, which invokes the Expression.__add__
method to create a new object of type Expression
:
a + 1
(a + 1)
The Python expression 1 + a
is equivalent to 1.__add__(a)
. But since the integer 1
doesn't know how to add a Symbol
, Python instead calls a.__radd__(1)
, where radd
stands for "reversed add":
1 + a
(1 + a)
Let's try something more interesting:
(-b + (b**2 - 4*a*c)) / (2 * a)
((-b + ((b ** 2) - ((4 * a) * c))) / (2 * a))
D(y, x)
with the Expression Class¶We can now rewrite D
; notice that the code is a bit neater, since we don't have to create tuples, we can just use operators like +
.
def D(y, x=x):
"""Return the symbolic derivative, dy/dx.
Handles binary operators +, -, *, /, and unary -, over variables and constants."""
if y == x:
return 1 # d/dx (x) = 1
if arity(y) == 0:
return 0 # d/dx (c) = 0
op = y.op
if arity(y) == 1:
u = y.args[0]
if op == '-': return -D(u, x) # d/dx (-u) = - du/dx
if arity(y) == 2:
(u, v) = y.args
if op == '+': return D(u, x) + D(v, x)
if op == '-': return D(u, x) - D(v, x)
if op == '*': return D(u, x) * v + D(v, x) * u
if op == '/': return (v * D(u, x) - u * D(v, x)) / v ** 2
raise ValueError("D can't handle this: " + str(y))
D(x + 2, x)
1
D(3 * x + c, x)
(((0 * x) + 3) + 0)
D(y * y, y)
((1 * y) + (1 * y))
D(- c * x, x)
((0 * x) + (1 * -c))
Let's represent expressions like $\sin(x)$. We'll need to be clear of the difference between the function sin
and the functional expression sin(x)
. I choose to do that by making sin
be an object of type Function
, which is a subtype of Expression
, and making sin(x)
be an Expression
whose op
is sin
:
Expression | Type | Arity | op | args |
---|---|---|---|---|
`sin` | `Function` | 0 | `'sin'` | `()` |
`sin(x)` | `Expression` | 1 | `sin` | `(x,)` |
What operations does Function
need to support? We need to be able to create a new one, with an op and no args; it can inherit the __init__
method from Expression
to do this. Once we create a new one, it needs to be able to create a functional expression, like sin(x)
. We can do that with the __call__
method of Function
:
class Function(Expression):
"A mathematical function of one argument, like sin or ln."
def __call__(self, x): return Expression(self, x)
sin, cos, tan, cot, sec, csc, ln, sqrt = map(Function, [
'sin', 'cos', 'tan', 'cot', 'sec', 'csc', 'ln', 'sqrt'])
sin
sin
vars(sin)
{'args': (), 'op': 'sin'}
sin(x)
sin(x)
vars(sin(x))
{'args': (x,), 'op': sin}
(-b + sqrt(b**2 - 4*a*c)) / (2 * a)
((-b + sqrt((b ** 2) - ((4 * a) * c))) / (2 * a))
sin(x)**2 + cos(x)**2
((sin(x) ** 2) + (cos(x) ** 2))
D(y, x)
to handle Exponentiation and Functions¶In this section, we extend D(y, x)
to handle:
u ** v
.Here are the rules:
$\frac{d}{dx} (x^n) = n x^{(n-1)}$ if n is an integer
$\frac{d}{dx} (u^v) = v u^{(v-1)} * \frac{du}{dx} + u^v * \ln(u) * \frac{dv}{dx}$
$\frac{d}{dx} \sin(x) = \cos(x)$
$\frac{d}{dx} \cos(x) = -\sin(x)$
$\frac{d}{dx} \tan(x) = \sec(x)^2$
$\frac{d}{dx} \cot(x) = -\csc(x)^2$
$\frac{d}{dx} \sec(x) = \sec(x) \tan(x)$
$\frac{d}{dx} \csc(x) = - \csc(x) \cot(x)$
$\frac{d}{dx} \ln(x) = 1 / x$
$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$
We can add these rules to the ones we had before, giving us:
def D(y, x=x):
"""Return the symbolic derivative, dy/dx.
Handles binary operators +, -, *, /, and unary -, as well as
transcendental trig and log functions over variables and constants."""
if y == x:
return 1 # d/dx (x) = 1
if arity(y) == 0:
return 0 # d/dx (c) = 0
if arity(y) == 1:
op, (u,) = y.op, y.args
if op == '+': return D(u, x)
if op == '-': return -D(u, x)
if u != x: return D(y, u) * D(u, x) ## CHAIN RULE
if op == sin: return cos(x)
if op == cos: return -sin(x)
if op == tan: return sec(x) ** 2
if op == cot: return -csc(x) ** 2
if op == sec: return sec(x) * tan(x)
if op == csc: return -csc(x) * cot(x)
if op == ln: return 1 / x
if arity(y) == 2:
op, (u, v) = y.op, y.args
if op == '+': return D(u, x) + D(v, x)
if op == '-': return D(u, x) - D(v, x)
if op == '*': return D(u, x) * v + D(v, x) * u
if op == '/': return (v * D(u, x) - u * D(v, x)) / v ** 2
if op == '**': return (v * u**(v-1) if u == x and isinstance(v, int) else
v * u**(v-1) * D(u, x) + u**v * ln(u) * D(v, x))
raise ValueError("D can't handle this: " + str(y))
The trickiest part is the chain rule. Consider $\frac{d}{dx} \sin(\ln(x))$. Let:
$y = \sin(\ln(x))$
$u = \ln(x)$
According to the chain rule,
$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$.
So we have:
$\frac{dy}{du} = \frac{d}{du} \sin(u) = \cos(u) = \cos(\ln(x))$
$\frac{du}{dx} = \frac{d}{dx} \ln(x) = \frac{1}{x}$
Finally, we can multiply them together to get:
$\frac{dy}{dx} = \frac{dy}{du} \;\frac{du}{dx} = \cos(\ln(x)) \; \frac{1}{x} = \cos(\ln(x)) / x$.
Now let's do the same thing in Python. Let:
Y = sin(ln(x))
U = ln(x)
So let's go on with the computation of the derivative, D(y, x) = D(y, u) * D(u, x)
. We'll look at the two terms on the right:
D(Y, U)
cos(ln(x))
D(U, x)
(1 / x)
And multiply the two terms together:
D(Y, U) * D(U, x)
(cos(ln(x)) * (1 / x))
We can do the same computation all in one step:
D(Y, x)
(cos(ln(x)) * (1 / x))
First an example where all is well:
D(x ** 3, x)
(3 * (x ** 2))
But most of the time, D
gives an answer that is technically correct, but not in simplified form. The example below should yield 2*a*x + b
:
D(a*x**2 + b*x + c, x)
((((0 * (x ** 2)) + ((2 * (x ** 1)) * a)) + ((0 * x) + (1 * b))) + 0)
The next example should give 50 * (5*x - 2)**9
:
D((5*x - 2)**10, x)
(((10 * (((5 * x) - 2) ** 9)) * (((0 * x) + 5) - 0)) + (((((5 * x) - 2) ** 10) * ln((5 * x) - 2)) * 0))
And for the next example, we should get $\frac{d}{dx} \sin(\ln(x^2)) = 2 \cos(\ln(x^2)) / x$.
D(sin(ln(x**2)), x)
(cos(ln(x ** 2)) * ((1 / (x ** 2)) * (2 * (x ** 1))))
In each case, the answer is correct, but not simplified.
simp(y)
¶Now let's work on simplification. We'll define simp(y)
to return an expression that is equivalent to the expression $y$, but simpler. We should say that the notion of simpler is not at all precise. Which is simpler, $x^2 -x -12$ or $(x + 3)(x - 4)$? It depends on what you want to do next. But we can all agree that $x$ is simpler than $(2 - 1)x + 0$. Our version of simp
will just do the easiest of simplifications. There's lots we could add to make it better.
Another difficult issue is dealing with indeterminate results like dividing by 0. We would like to be able to simplify $x/x$ to 1. But that is only true if $x$ is not 0. We'll ignore that problem and go ahead and simplify $x/x$ to 1, but we'll feel a little uneasy about it. We'll also simplify 0/0 to undefined
and 1/0 to infinity
.
def simp(y):
"Simplify an expression."
if arity(y) == 0:
return y
y = Expression(y.op, *map(simp, y.args)) ## Simplify the sub-expressions first
op = y.op
if arity(y) == 1:
(u,) = y.args
if y in simp_table:
return simp_table[y]
if op == '+':
return u # + u = u
if op == '-':
if arity(u) == 1 and u.op == '-':
return u.args[0] # - - w = w
if arity(y) == 2:
(u, v) = y.args
if evaluable(u, op, v): # for example, (3 + 4) simplifies to 7
return eval(str(u) + op + str(v), {})
if op == '+':
if u == 0: return v # 0 + v = v
if v == 0: return u # u + 0 = u
if u == v: return 2 * u # u + u = 2 * u
if op == '-':
if u == v: return 0 # u - v = 0
if v == 0: return u # u - 0 = u
if u == 0: return -v # 0 - v = -v
if op == '*':
if u == 0 or v == 0: return 0 # 0 * v = u * 0 = 0
if u == 1: return v # 1 * v = v
if v == 1: return u # u * 1 = u
if u == v: return u ** 2 # u * u = u^2
if op == '/':
if u == 0 and v == 0: return undefined # 0 / 0 = undefined
if u == v: return 1 # u / u = 1
if v == 1: return u # u / 1 = u
if u == 0: return 0 # 0 / v = 0
if v == 0: return infinity # u / 0 = infinity
if op == '**':
if v == 1: return u # u ** 1 = u
if u == v == 0: return undefined # 0 ** 0 = undefined
if v == 0: return 1 # u ** 0 = 1
# If no rules apply, return y unchanged.
return y
from numbers import Number
# Deal with infinity and with undefined numbers (like infinity minus infinity)
infinity = float('inf')
undefined = nan = (infinity - infinity)
# Table of known exact values for certain functions.
# Use this, for example, to simplify sin(pi) to 0 or ln(e) to 1.
simp_table = {
sin(0): 0, sin(pi): 0, sin(pi/2): 1, sin(2*pi): 0,
cos(0): 1, cos(pi): -1, cos(pi/2): 1, cos(2*pi): 1,
tan(0): 0, tan(pi): 0, tan(pi/2): infinity, tan(2*pi): 0, tan(pi/4): 1,
ln(1): 0, ln(e): 1, ln(0): -infinity}
def evaluable(u, op, v):
"Can we evaluate (u op v) to a number? True if u and v are numbers, and not a special case like 0^0."
return (isinstance(u, Number) and isinstance(v, Number)
and not (op == '/' and (v == 0 or u % v != 0))
and not (op == '^' and (u == v == 0)))
Let's try it out. First an easy one; $1x + 0$ should be $x$: Simplifying
simp(x*(y/y) + 0)
x
Similarly, $(2 - 1)x^{\cos(y-y)} + \tan(\pi)$ should be just $x$:
simp((2 - 1) * x ** cos(y-y) + tan(pi))
x
Here are two examples where simp
helps, but does not do all the simplifications it could:
simp(D(sin(ln(x**2)), x))
(cos(ln(x ** 2)) * ((1 / (x ** 2)) * (2 * x)))
simp(D((5 * x - 2)**10, x))
((10 * (((5 * x) - 2) ** 9)) * 5)
simp(cos(pi - pi))
1
simp(D(3 * x + c, x))
3