{ "cells": [ { "cell_type": "markdown", "id": "57272707-93ff-4e58-bdaf-d73c4cb93fd9", "metadata": {}, "source": [ "# Introduction to Automatic Differentiation" ] }, { "cell_type": "markdown", "id": "fb007543-4157-4a4d-aef1-2713c9ce6877", "metadata": {}, "source": [ "*Automatic differentiation* (usually abbreviated as `AD`), also called *algorithmic differentiation*, is a set of techniques to evaluate the partial derivative of a function specified by a computer program.\n", "\n", "Automatic differentiation exploits the fact that every computer calculation, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.) and elementary functions (exp, log, sin, cos, etc.). By applying the chain rule repeatedly to these operations, partial derivatives of arbitrary order can be computed automatically, efficiently, and accurately.\n", "\n", "Automatic differentiation is distinct from *symbolic differentiation* and *numerical differentiation*." ] }, { "cell_type": "code", "execution_count": null, "id": "192b37fb-924d-4fa6-80ae-4b55d578a149", "metadata": {}, "outputs": [], "source": [ "\n", "using PyPlot\n", "using Printf\n", "using SymPy" ] }, { "cell_type": "markdown", "id": "f6b6c096-467d-4b22-b282-8d78f6fcc2ed", "metadata": {}, "source": [ "## Symbolic differentiation" ] }, { "cell_type": "code", "execution_count": null, "id": "0ad49cd8-f702-4554-971b-f28f7ca51878", "metadata": {}, "outputs": [], "source": [ "\n", "@syms q;" ] }, { "cell_type": "markdown", "id": "fea55103-98cd-4bc7-bfb2-479858475b42", "metadata": {}, "source": [ "Sigmoid function (aka logistic function):\n", "\n", "$$f_0(x) = \\frac{1}{1 + e^{-x}}$$\n", "\n", "It has the property\n", "\n", "$$\\frac{\\mathrm{d}f_0}{\\mathrm{d}x} = f_0(x) \\, (1 - f_0(x))$$\n", "\n", "that we use to verify the numerical value of the derivative." ] }, { "cell_type": "code", "execution_count": null, "id": "75c36cf2-7174-4911-8483-ac113b31ba37", "metadata": {}, "outputs": [], "source": [ "\n", "f0(x) = 1 / (1 + exp(-x))" ] }, { "cell_type": "code", "execution_count": null, "id": "32f01564-488f-41b0-a455-c332a055d2b8", "metadata": {}, "outputs": [], "source": [ "\n", "diff(f0(q), q)" ] }, { "cell_type": "code", "execution_count": null, "id": "5a378571-6258-4b24-b5b5-16433a23829a", "metadata": {}, "outputs": [], "source": [ "\n", "f0p = lambdify(diff(f0(q), q));" ] }, { "cell_type": "code", "execution_count": null, "id": "ea2ff2b6-92b4-45ef-a5de-4d37cb856893", "metadata": {}, "outputs": [], "source": [ "\n", "r = 3.0\n", "f0pr = f0p(r)" ] }, { "cell_type": "code", "execution_count": null, "id": "d271ab01-6a78-490c-aa9b-7a7be796aa2b", "metadata": {}, "outputs": [], "source": [ "\n", "f0pr ≈ f0(r) * (1.0 - f0(r))" ] }, { "cell_type": "markdown", "id": "fd816c2a-48b1-4717-8a66-313d2c7ba15d", "metadata": {}, "source": [ "For reference, here are the graphs of sigmoid function and its derivative:" ] }, { "cell_type": "code", "execution_count": null, "id": "5a43e04a-67a7-4952-890c-4e2d1b4de97f", "metadata": {}, "outputs": [], "source": [ "\n", "m = 100\n", "r = range(-6.0, 6.0, m)\n", "plot(r, f0.(r), label=L\"f_0(r)\")\n", "plot(r, f0p.(r), label=L\"f_0'(r)\", linestyle=\"dashed\")\n", "grid(true)\n", "xlabel(\"r\")\n", "title(\"Sigmoid function and its first derivative\")\n", "legend();" ] }, { "cell_type": "markdown", "id": "1fbbb67a-0ad5-47c7-ab93-5cd0d678f70d", "metadata": {}, "source": [ "## Numerical differentiation: finite difference\n", "\n", "Recall the definition of derivative:\n", "$$\\frac{\\mathrm{d}f}{\\mathrm{d}x} \n", "= \\lim_{\\Delta x \\to 0} \\frac{f(x + \\Delta x) - f(x)}{\\Delta x}\n", "= \\lim_{\\Delta x \\to 0} \\frac{f(x + \\Delta x) - f(x - \\Delta x)}{2\\Delta x}$$" ] }, { "cell_type": "markdown", "id": "e022676d-83a9-47ac-beb1-58a83a367ff1", "metadata": {}, "source": [ "Let's define two functions for numerical derivative that use the forward and the central finite difference approximations:" ] }, { "cell_type": "code", "execution_count": null, "id": "ff21bb95-1714-4b44-8b51-c4ed33944ad8", "metadata": {}, "outputs": [], "source": [ "\n", "forward(f, x, h) = (f(x + h) - f(x)) / h\n", "central(f, x, h) = (f(x + h) - f(x - h)) / (2 * h);" ] }, { "attachments": {}, "cell_type": "markdown", "id": "82eeb151-0c3d-4e87-ad50-4dec62de6267", "metadata": {}, "source": [ "The accuracy of the finite difference approximations:\n", "\n", "$$f(x + h) = f(x) + h f'(x) + \\frac{h^2}{2} f''(x) + \\frac{h^3}{6} f'''(x) + \\ldots$$\n", "$$f(x - h) = f(x) - h f'(x) + \\frac{h^2}{2} f''(x) - \\frac{h^3}{6} f'''(x) + \\ldots$$\n", "$$\\left.\\frac{\\mathrm{d}f}{\\mathrm{d}x}\\right|_{\\mathrm{forward}} = \n", "\\frac{f(x + h) - f(x)}{h} =\n", "\\frac{h f'(x) + \\frac{h^2}{2} f''(x) + \\frac{h^3}{6} f'''(x) + \\ldots}{h} \n", "= f'(x) + \\frac{h}{2} f''(x) + \\ldots$$\n", "$$\\left.\\frac{\\mathrm{d}f}{\\mathrm{d}x}\\right|_{\\mathrm{central}} = \n", "\\frac{f(x + h) - f(x - h)}{2h} =\n", "\\frac{2 h f'(x) + \\frac{h^3}{3} f'''(x) + \\ldots}{h} \n", "= f'(x) + \\frac{h^2}{3} f'''(x) + \\ldots$$" ] }, { "cell_type": "code", "execution_count": null, "id": "4ba79d05-b07c-45e6-be3e-724fd3266c76", "metadata": {}, "outputs": [], "source": [ "\n", "hs = 2.0 .^ ((-42):2:(-3));" ] }, { "cell_type": "markdown", "id": "fb0d24f5-2de3-48a2-981b-65dda44b0d84", "metadata": {}, "source": [ "The test function and its analytic derivative:" ] }, { "cell_type": "code", "execution_count": null, "id": "75065dbf-35e4-4fb5-bdab-50d96701c473", "metadata": {}, "outputs": [], "source": [ "\n", "f(x) = exp(x) + sin(x)\n", "fp(x) = exp(x) + cos(x);" ] }, { "cell_type": "code", "execution_count": null, "id": "e380a103-401f-4789-a960-d22d1ee633da", "metadata": {}, "outputs": [], "source": [ "\n", "x0 = 1.0\n", "fpx0 = fp(x0);" ] }, { "cell_type": "code", "execution_count": null, "id": "97e6075e-8017-4aba-8520-a1f13e864783", "metadata": {}, "outputs": [], "source": [ "\n", "err_f = abs.(forward.(f, x0, hs) .- fpx0)\n", "err_c = abs.(central.(f, x0, hs) .- fpx0);" ] }, { "cell_type": "code", "execution_count": null, "id": "191b2007-5a1c-49f9-9fea-07b8ac4be414", "metadata": {}, "outputs": [], "source": [ "\n", "loglog(dxs, err_f; marker=\".\", label=\"Forward [x, x+h]\")\n", "loglog(dxs, err_c; marker=\".\", label=\"Central [x-h, x+h]\")\n", "legend()\n", "grid(true)\n", "xlabel(\"step\")\n", "ylabel(\"absolute error\")\n", "title(\"Error of finite difference approximation\");" ] }, { "cell_type": "markdown", "id": "4fc24e2b-3330-4c10-8f14-3f5be5e3c3ab", "metadata": {}, "source": [ "## Automatic differentiation using dual numbers\n", "\n", "> **References and further reading**: \n", "https://blog.esciencecenter.nl/automatic-differentiation-from-scratch-23d50c699555, \n", "https://github.com/PabRod/DualDiff.jl, https://github.com/mitmath/18330/blob/spring19/05_Numerical_Differentiation.ipynb, \n", "https://book.sciml.ai/notes/08-Forward-Mode_Automatic_Differentiation_(AD)_via_High_Dimensional_Algebras/" ] }, { "cell_type": "markdown", "id": "bbbf6246-0908-4f29-87be-03c47f83014e", "metadata": {}, "source": [ "### Dual numbers\n", "\n", "A dual number is very similar to a vector with just two elements:\n", "\n", "$$z = (u, u') .$$\n", "\n", "The first element represents the **value** of a function at a given point, and the second one is the **value** of its derivative at the same point. \n", "\n", "For instance, the *constant* 3 is written as the dual number `(3, 0)` - since it is a constant, its derivative is 0.\\\n", "The *variable* x = 3 is written as `(3,1)`- the 1 meaning that the derivative of x with respect to x is 1." ] }, { "cell_type": "markdown", "id": "ccd1f9cb-e4bc-4807-9f39-99d249d17ebb", "metadata": {}, "source": [ "Let’s start defining addition, subtraction, and multiplication by a scalar. We decide they follow exactly the same rules that vectors do:\n", "\n", "$$(u, u') + (v, v') = (u + v, u' + v'),$$\n", "$$(u, u') - (v, v') = (u - v, u' - v'),$$\n", "$$k \\cdot (u, u') = (ku, ku').$$" ] }, { "cell_type": "markdown", "id": "bccac39c-b100-4eb0-9c0e-2e159076089d", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "The multiplication is defined in a more interesting way:\n", "\n", "$$(u, u') \\cdot (v, v') = (u v, u'v + uv').$$\n", "\n", "Why? Because the second term represents a derivative; it has to follow the product rule for derivatives." ] }, { "cell_type": "markdown", "id": "19eb4650-9ae0-4a59-816f-ad0cf217b9f0", "metadata": {}, "source": [ "The division of dual numbers follows the quotient rule for derivatives:\n", "\n", "$$\\frac{(u,u')}{(v,v')} = \\left(\\frac{u}{v}, \\frac{u'v - uv'}{v^2}\\right).$$" ] }, { "cell_type": "markdown", "id": "85387206-2944-44fe-804d-fd9e1303cf77", "metadata": {}, "source": [ "Last, the power of a dual number is defined as:\n", "\n", "$$(u, u')^v = (u^v, vu^{v-1}u').$$" ] }, { "cell_type": "markdown", "id": "ccce4dcc-9f5b-4eb8-9b36-b8d49c08d91b", "metadata": {}, "source": [ "The operations defined above cover a lot of ground. Indeed, any algebraic operation can be built using them as basic components. This means that we can pass a dual number as the argument of an algebraic function, and here comes the magic, the result will be:\n", "\n", "$$f_{\\mathrm{algebraic}}((x,1)) = (f(x), f'(x))$$\n", "\n", "The equation above tells us that just by feeding the function the dual number (x, 1) it return its value at, plus its derivative! " ] }, { "cell_type": "markdown", "id": "028b2fc9-5044-4689-ae81-e259f58bc828", "metadata": {}, "source": [ "### Implementation of Dual numbers\n", "\n", "The rules of differentiation turn a calculus problem into an algebra one.\n", "So, how do these rules look in Julia? First of all, we need to define a Dual object, representing a dual number. \n", "It is as simple as a container for two real numbers:" ] }, { "cell_type": "code", "execution_count": null, "id": "602f1845-e54e-4f62-99a9-90e5d278c5a4", "metadata": {}, "outputs": [], "source": [ "# Structure representing a Dual number\n", "struct Dual{T<:Real}\n", " val::T\n", " deriv::T\n", "end" ] }, { "cell_type": "code", "execution_count": null, "id": "0b67c529-3131-47a0-aa86-12683dc817df", "metadata": {}, "outputs": [], "source": [ "w = Dual(1,1)" ] }, { "cell_type": "markdown", "id": "75782460-5e3d-47c9-a515-6de64b9d3b5b", "metadata": {}, "source": [ "### Operations with Dual numbers" ] }, { "cell_type": "markdown", "id": "445c4634-f05d-486a-ab95-7f56d4aac9f1", "metadata": {}, "source": [ "Let's start with the operation of addition, `+`:" ] }, { "cell_type": "code", "execution_count": null, "id": "774443ee-74fc-4181-8424-8a83fa1a8465", "metadata": {}, "outputs": [], "source": [ "import Base: +\n", "\n", "+(f::Dual, g::Dual) = Dual(f.val + g.val, f.deriv + g.deriv)\n", "+(f::Dual, a::Number) = Dual(f.val + a, f.deriv)\n", "+(a::Number, f::Dual) = f + a" ] }, { "cell_type": "code", "execution_count": null, "id": "152eebf0-07a4-45ac-aef1-72c89beb9ea2", "metadata": {}, "outputs": [], "source": [ "y = Dual(2, 1)" ] }, { "cell_type": "code", "execution_count": null, "id": "f3278476-a87f-4f24-98c3-450e39a0ecd6", "metadata": {}, "outputs": [], "source": [ "y + w" ] }, { "cell_type": "code", "execution_count": null, "id": "9d649eb2-2c72-4476-bb05-594caa2626e0", "metadata": {}, "outputs": [], "source": [ "w + 1" ] }, { "cell_type": "code", "execution_count": null, "id": "89649488-2d19-4cdd-bc55-33eb2af93c9f", "metadata": {}, "outputs": [], "source": [ "w + Dual(0.0, 0.0)" ] }, { "cell_type": "markdown", "id": "790e0a0e-5b07-4d28-99ab-f46667fbad3e", "metadata": {}, "source": [ "Let's define the subtraction of two Dual numbers:" ] }, { "cell_type": "code", "execution_count": null, "id": "0277e622-1ae0-4279-81c0-136a4fab23ea", "metadata": {}, "outputs": [], "source": [ "import Base: -\n", "\n", "-(f::Dual, g::Dual) = Dual(f.val - g.val, f.deriv - g.deriv)\n", "-(f::Dual, a::Number) = Dual(f.val - a, f.deriv)\n", "-(f::Dual) = Dual(-f.val, -f.deriv) # the negation of a dual number\n", "-(a::Number, f::Dual) = -f + a" ] }, { "cell_type": "code", "execution_count": null, "id": "5189b932-d8f0-487f-aeb6-b9a28f008e9d", "metadata": {}, "outputs": [], "source": [ "w - y" ] }, { "cell_type": "code", "execution_count": null, "id": "80169658-2b87-477f-a872-a187535a5540", "metadata": {}, "outputs": [], "source": [ "w - 1" ] }, { "cell_type": "code", "execution_count": null, "id": "224839dc-5cc5-4a20-a732-ddb3029f3fa2", "metadata": {}, "outputs": [], "source": [ "1 - w" ] }, { "cell_type": "markdown", "id": "16ca15a0-e94b-4912-9310-076bcb9ea800", "metadata": {}, "source": [ "More interesting stuff happens with multiplication, division, and with power to a real number:" ] }, { "cell_type": "code", "execution_count": null, "id": "e96f9725-1b0c-49be-99c7-f573254854df", "metadata": {}, "outputs": [], "source": [ "import Base: *, /, ^\n", "\n", "# Product Rule\n", "*(f::Dual, g::Dual) = Dual(f.val * g.val, f.deriv * g.val + f.val * g.deriv)\n", "*(a::Number, f::Dual) = Dual(f.val * a, f.deriv * a)\n", "*(f::Dual, a::Number) = a * f\n", "\n", "# Quotient Rule\n", "/(f::Dual, g::Dual) = Dual(f.val / g.val, (f.deriv * g.val - f.val * g.deriv) / (g.val^2))\n", "/(a::Number, f::Dual) = Dual(a / f.val, -a * f.deriv / f.val^2)\n", "/(f::Dual, a::Number) = f * inv(a)\n", "\n", "# Power Rule\n", "^(f::Dual, x::Number) = Dual(f.val ^ x, x * f.val ^ (x - one(x)) * f.deriv);" ] }, { "cell_type": "markdown", "id": "09a7e0c5-2c10-4db6-b3f2-ded54526c80d", "metadata": {}, "source": [ "Let's teach Julia that $\\sqrt(x)$ is just anothe notation for $x^{1/2}$:" ] }, { "cell_type": "code", "execution_count": null, "id": "04f15e8d-51ba-4722-9a87-54d4a0091a6e", "metadata": {}, "outputs": [], "source": [ "import Base: sqrt\n", "\n", "sqrt(z::Dual) = z^(1/2)" ] }, { "cell_type": "markdown", "id": "8fa95856-e976-4c26-ab8f-6384c0e87f74", "metadata": {}, "source": [ "### Examples\n", "\n", "*Algebraic functions* are expressions that use a finite number of terms, involving only the algebraic operations addition, subtraction, multiplication, division, and raising to a fractional power. Examples of such functions are as follows:\n", "\n", "$$f_1(x) = x^3 + x^2 + x + 1, \\quad f_2(x) = 1/(x + 1), \\quad f_3(x) = \\sqrt{x + 1/x}, \\quad f_4(x) = \\frac{{\\sqrt{1+x^3}}}{x^{3/7}-\\sqrt{7}x^{1/3}}$$" ] }, { "cell_type": "code", "execution_count": null, "id": "1a1edb6b-b125-4e56-b893-d4828c4f1780", "metadata": {}, "outputs": [], "source": [ "using BenchmarkTools" ] }, { "cell_type": "markdown", "id": "690a457c-cae3-488f-ae1e-f8ae78932f5f", "metadata": {}, "source": [ "Let’s start by calculating the derivative of the polynomial, $f_1(x)$, at $x = 3$ (note that $f_1(3) = 40$). \n", "To check the result of the automatic differentiation, we compute the derivative by hand: $f'_1(x) = 3 x^2 + 2 x + 1$, $f'_1(3) = 34$." ] }, { "cell_type": "code", "execution_count": null, "id": "5e035a0a-7b49-4bf2-9b7a-e066db79f820", "metadata": {}, "outputs": [], "source": [ "z = Dual(3.0, 1.0)" ] }, { "cell_type": "code", "execution_count": null, "id": "cd8b3e0f-9511-4f42-a0b3-0f9da774d47c", "metadata": {}, "outputs": [], "source": [ "f1(x) = x^3 + x^2 + x + 1 \n", "f1(z)" ] }, { "cell_type": "code", "execution_count": null, "id": "afc660a1-e410-4d64-bc2b-7649f093be17", "metadata": {}, "outputs": [], "source": [ "@btime f1($z);" ] }, { "cell_type": "code", "execution_count": null, "id": "c67364c0-d99d-49a4-a707-bf950c6c6ac1", "metadata": {}, "outputs": [], "source": [ "f1p = lambdify(diff(f1(q), q));\n", "f1p(z.val)" ] }, { "cell_type": "code", "execution_count": null, "id": "7a8cd046-ffa1-4cee-8c96-b3f20b363342", "metadata": {}, "outputs": [], "source": [ "@btime f1p($z.val);" ] }, { "cell_type": "code", "execution_count": null, "id": "50d9852d-e835-4d23-8ea1-03122f685933", "metadata": {}, "outputs": [], "source": [ "central(f1, z.val, sqrt(eps()))" ] }, { "cell_type": "code", "execution_count": null, "id": "518529ae-5756-4e38-a204-5f43e4547df8", "metadata": {}, "outputs": [], "source": [ "@btime central(f1, $z.val, sqrt(eps()));" ] }, { "cell_type": "markdown", "id": "346df7b3-0e62-44d0-bf82-9cc5a4adfe4b", "metadata": {}, "source": [ "Next, let's consider a derivative of a fraction:" ] }, { "cell_type": "code", "execution_count": null, "id": "f691eebc-92d0-4c47-8aa2-f7fcbf98195f", "metadata": {}, "outputs": [], "source": [ "f2(x) = 1 / (x + 1)\n", "f2(z)" ] }, { "cell_type": "code", "execution_count": null, "id": "627b4b97-af61-417d-b49f-f19c1e00c5a3", "metadata": {}, "outputs": [], "source": [ "@btime f2($z)" ] }, { "cell_type": "markdown", "id": "bdc33463-1121-4f31-a07c-22a0b6e09f6e", "metadata": {}, "source": [ "Now, we calculate the derivative of a function containing $\\sqrt{(\\ldots)}$:" ] }, { "cell_type": "code", "execution_count": null, "id": "5388e2ce-926f-48ff-8aab-e0393caff143", "metadata": {}, "outputs": [], "source": [ "f3(x) = sqrt(x + 1/x)\n", "f3(z)" ] }, { "cell_type": "code", "execution_count": null, "id": "29e235ac-3b49-43cd-808e-b7fa9fdfdb9c", "metadata": {}, "outputs": [], "source": [ "@btime f3($z)" ] }, { "cell_type": "code", "execution_count": null, "id": "41a1a3f3-ba8f-4bb5-a1a5-aaeee91e84e6", "metadata": {}, "outputs": [], "source": [ "f4(x) = sqrt(1 + x^3)/(x^(3/7) - sqrt(7)*x^(1/3))\n", "res4 = f4(z)" ] }, { "cell_type": "code", "execution_count": null, "id": "cc80612a-3e2c-4257-941f-5016af31ad0a", "metadata": {}, "outputs": [], "source": [ "@btime f4($z)" ] }, { "cell_type": "markdown", "id": "b955d0ad-68e4-4369-a4b6-f08c360b772a", "metadata": {}, "source": [ "The function definition does not have to be a single line:" ] }, { "cell_type": "code", "execution_count": null, "id": "d520b557-a826-47e1-8a5c-d981c741f4d4", "metadata": {}, "outputs": [], "source": [ "function f4a(x)\n", " a = x ^ 3\n", " b = x ^ (3/7)\n", " c = sqrt(7.0) * x ^ (1/3)\n", " return sqrt(1 + a)/(b - c)\n", "end\n", "\n", "res4a = f4a(z)" ] }, { "cell_type": "code", "execution_count": null, "id": "fee45b9f-a53e-4b87-ba58-a61cd10bfb51", "metadata": {}, "outputs": [], "source": [ "@btime f4a($z)" ] }, { "cell_type": "markdown", "id": "ff4bf305-66ef-43fd-bd31-f3ac0f0eb133", "metadata": {}, "source": [ "Let's check equality of res4 and res4a:" ] }, { "cell_type": "code", "execution_count": null, "id": "5139fc44-a466-4ded-bf09-5aecfacaeaf0", "metadata": {}, "outputs": [], "source": [ "res4.val ≈ res4a.val" ] }, { "cell_type": "code", "execution_count": null, "id": "2fb3213a-dd5b-4bbe-88d6-58c1e3807566", "metadata": {}, "outputs": [], "source": [ "res4.deriv ≈ res4a.deriv" ] }, { "cell_type": "markdown", "id": "bfe20595-9ee2-4a62-ba28-59aafd99d913", "metadata": {}, "source": [ "### What about non-algebraic functions?\n", "\n", "The method used so far will fail as soon as our function contains a non-algebraic element, such as a sine or an exponential. To fix the problem, we can just teach our computer some more basic derivatives. For instance, the derivative of a sine is a cosine. In the language of dual numbers, this reads:\n", "\n", "$$\\sin(u, u') = (\\sin(u), \\cos(u)u')$$" ] }, { "cell_type": "markdown", "id": "618d7faf-9d73-4a5a-b1bd-05cb450bb42c", "metadata": {}, "source": [ "So now, we only have to open our derivatives table and fill line by line, starting with the derivative of a sine, continuing with that of a cosine, a tangent, etc." ] }, { "cell_type": "code", "execution_count": null, "id": "559ef132-fb72-4ff3-bcf4-51e49e8180de", "metadata": {}, "outputs": [], "source": [ "import Base: sin, cos, exp\n", "\n", "exp(z::Dual) = Dual(exp(z.val), z.deriv * exp(z.val))\n", "sin(z::Dual) = Dual(sin(z.val), z.deriv * cos(z.val))\n", "cos(z::Dual) = Dual(cos(z.val),-z.deriv * sin(z.val));" ] }, { "cell_type": "markdown", "id": "b8d8afbf-40a7-4b97-bf64-038d40c6cb8b", "metadata": {}, "source": [ "We don’t even need to fill all the derivatives manually. For instance, the tangent is defined as:\n", "\n", "$$ \\tan(x) = \\frac{\\sin(x)}{\\cos(x)}$$\n", "\n", "and we already have automatically differentiable sine, cosine, and division in our arsenal. So this line will do the trick:" ] }, { "cell_type": "code", "execution_count": null, "id": "167037b5-4602-4e5b-a23f-abd18e72d3dc", "metadata": {}, "outputs": [], "source": [ "import Base: tan\n", "\n", "tan(z::Dual) = sin(z) / cos(z) # We can rely on previously defined functions!" ] }, { "cell_type": "markdown", "id": "90401413-767c-4b39-bd4b-9718c6cbdee4", "metadata": {}, "source": [ "### Example\n", "\n", "Let’s compute the derivative of the following non-algebraic function:\n", "\n", "$$f(x) = x + \\tan(\\sin^2(x) + \\cos^2(x))$$\n", "\n", "This is a complicated function. However, the derivative is 1 everywhere (notice that the argument of the tangent is actually constant). Now, using Dual:" ] }, { "cell_type": "code", "execution_count": null, "id": "4a1682df-9b34-4fe7-9aea-97bfd905d00e", "metadata": {}, "outputs": [], "source": [ "fun(x) = x + tan(cos(x)^2 + sin(x)^2)\n", "u = Dual(0, 1)\n", "fun(u)" ] }, { "cell_type": "markdown", "id": "a356d0e3-5bad-47b8-a002-536a0f7d7715", "metadata": {}, "source": [ "### Example: tangential lines\n", "\n", "Now we want to calculate and plot the tangential line of the following function at many values of x:\n", "\n", "$$f(x) = x^2 - 5x + 6 - 5x^3 - 5e^{-50x^2}$$ " ] }, { "cell_type": "code", "execution_count": null, "id": "2b91b6b3-2f83-48c7-9833-954fdccc7276", "metadata": {}, "outputs": [], "source": [ "f(x) = x^2 - 5 * x + 6 - 5 * x^3 - 5 * exp(-50 * x^2)" ] }, { "cell_type": "code", "execution_count": null, "id": "6640c81f-68d3-4953-8406-5b164f0cd2b4", "metadata": {}, "outputs": [], "source": [ "xmin = -0.7\n", "np = 200\n", "xx = [range(xmin, -xmin, np); range(-xmin, xmin, np)]; # back and forth\n", "xmin2 = -1.0\n", "x2 = range(xmin2, -xmin2, np)\n", "ff = f.(x2);" ] }, { "cell_type": "code", "execution_count": null, "id": "699cfc04-435c-4d43-b5c8-4ea4d709176d", "metadata": {}, "outputs": [], "source": [ "fig = figure()\n", "for a in xx\n", " plot(x2, ff, color=\"blue\") # plot the graph of f(x)\n", " plot(a, f(a), marker=\"o\", markersize=3, color=\"black\") # mark the point where the tangential line is drawn\n", "\n", " s = f(Dual(a, 1.0))\n", " L(x) = s.val + s.deriv * (x - a) # function of the tangential line at x = a\n", " plot([xmin2, -xmin2], [L(xmin2), L(-xmin2)], color=\"red\", linewidth=1.0) # the graph of a tangential line\n", " \n", " ylim(-5.0, 15.0)\n", " titl = @sprintf(\"a = % 5.3f\", a)\n", " title(titl, family=\"monospace\")\n", " xlabel(L\"$x$\")\n", " ylabel(L\"$f(x)$\")\n", " grid(true)\n", " \n", " display(fig)\n", " IJulia.clear_output(true)\n", " clf()\n", "end" ] }, { "cell_type": "markdown", "id": "f12c29ea-12ff-4343-972b-e5fb92c99987", "metadata": {}, "source": [ "### Example: solving nonlinear equations using Newton's method" ] }, { "cell_type": "markdown", "id": "6eed684f-0db0-4409-8ce6-4e4e07648e87", "metadata": {}, "source": [ "$$x_{n+1} = x_n - \\frac{f(x_n)}{f'(x_n)}$$" ] }, { "cell_type": "code", "execution_count": null, "id": "1b1ca6a7-0a11-4bfa-9b75-406677251d11", "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", " root, niters = newton(f, x0)\n", "\n", "Find the root of the nonlinear equation f(x) = 0 using Newton's method. \n", "x_0 is initial approximation for the root.\n", "\"\"\"\n", "function newton(f, x0)\n", " x = x0\n", " delta = one(x)\n", " count = 0 # number of iterations\n", "\n", " while (abs(delta) > 10 * eps()) && count < 20 \n", " r = f(Dual(x, one(x)))\n", " delta = - r.val/r.deriv\n", " x += delta\n", " count += 1\n", " @show x, delta, count\n", " end\n", "\n", " return x, count\n", "end" ] }, { "cell_type": "markdown", "id": "b0a796b7-f3d1-414f-a1b0-40b6a5590fbb", "metadata": {}, "source": [ "Let's solve the following nonlinear equation: $$e^{-x} = x$$" ] }, { "cell_type": "code", "execution_count": null, "id": "abfa933d-c6db-4aec-bc35-c435d3f439f4", "metadata": {}, "outputs": [], "source": [ "g(x) = exp(-x) - x\n", "x0 = 1.0\n", "root, _ = newton(g, x0)" ] }, { "cell_type": "code", "execution_count": null, "id": "be459c62-363c-42e9-81b0-2641fd7da6c8", "metadata": {}, "outputs": [], "source": [ "x = range(0.0, 3.0, np)\n", "gg = exp.(-x)\n", "plot(x, gg, color=\"blue\", label=L\"$e^{-x}$\")\n", "plot(x, x, color=\"blue\", linestyle=\"dashed\", label=L\"x\")\n", "plot(root, exp(-root), marker=\"o\", markersize=4, color=\"red\")\n", "grid(true)\n", "xlabel(\"x\")\n", "legend()\n", "ylim(0.0, 1.0);" ] }, { "cell_type": "code", "execution_count": null, "id": "ed4e95d9-b81c-4472-a975-ded5de8b040a", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.10.6", "language": "julia", "name": "julia-1.10" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }