{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![](../docs/banner.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Chapter 2: Loops & Functions" ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Chapter Outline

\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Chapter Learning Objectives\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Write `for` and `while` loops in Python.\n", "- Identify iterable datatypes which can be used in `for` loops.\n", "- Create a `list`, `dictionary`, or `set` using comprehension.\n", "- Write a `try`/`except` statement.\n", "- Define a function and an anonymous function in Python.\n", "- Describe the difference between positional and keyword arguments.\n", "- Describe the difference between local and global arguments.\n", "- Apply the `DRY principle` to write modular code.\n", "- Assess whether a function has side effects.\n", "- Write a docstring for a function that describes parameters, return values, behaviour and usage." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. `for` Loops\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For loops allow us to execute code a specific number of times." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The number is 2 and its square is 4\n", "The number is 7 and its square is 49\n", "The number is -1 and its square is 1\n", "The number is 5 and its square is 25\n", "I'm outside the loop!\n" ] } ], "source": [ "for n in [2, 7, -1, 5]:\n", " print(f\"The number is {n} and its square is {n**2}\")\n", "print(\"I'm outside the loop!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main points to notice:\n", "\n", "* Keyword `for` begins the loop. Colon `:` ends the first line of the loop.\n", "* Block of code indented is executed for each value in the list (hence the name \"for\" loops)\n", "* The loop ends after the variable `n` has taken all the values in the list\n", "* We can iterate over any kind of \"iterable\": `list`, `tuple`, `range`, `set`, `string`.\n", "* An iterable is really just any object with a sequence of values that can be looped over. In this case, we are iterating over the values in a list." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Gimme a P!\n", "Gimme a y!\n", "Gimme a t!\n", "Gimme a h!\n", "Gimme a o!\n", "Gimme a n!\n", "What's that spell?!! Python!\n" ] } ], "source": [ "word = \"Python\"\n", "for letter in word:\n", " print(\"Gimme a \" + letter + \"!\")\n", "\n", "print(f\"What's that spell?!! {word}!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A very common pattern is to use `for` with the `range()`. `range()` gives you a sequence of integers up to some value (non-inclusive of the end-value) and is typically used for looping." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "range(0, 10)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "range(10)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(range(10))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] } ], "source": [ "for i in range(10):\n", "    print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also specify a start value and a step value with `range()`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "11\n", "21\n", "31\n", "41\n", "51\n", "61\n", "71\n", "81\n", "91\n" ] } ], "source": [ "for i in range(1, 101, 10):\n", "    print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can write a loop inside another loop to iterate over multiple dimensions of data:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 'a')\n", "(1, 'b')\n", "(1, 'c')\n", "(2, 'a')\n", "(2, 'b')\n", "(2, 'c')\n", "(3, 'a')\n", "(3, 'b')\n", "(3, 'c')\n" ] } ], "source": [ "for x in [1, 2, 3]:\n", "    for y in [\"a\", \"b\", \"c\"]:\n", "        print((x, y))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 a\n", "1 b\n", "2 c\n" ] } ], "source": [ "list_1 = [0, 1, 2]\n", "list_2 = [\"a\", \"b\", \"c\"]\n", "for i in range(3):\n", "    print(list_1[i], list_2[i])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"There are many clever ways of doing these kinds of things in Python. When looping over objects, I tend to use `zip()` and `enumerate()` quite a lot in my work. `zip()` returns a zip object which is an iterable of tuples." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 'a')\n", "(1, 'b')\n", "(2, 'c')\n" ] } ], "source": [ "for i in zip(list_1, list_2):\n", " print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can even \"unpack\" these tuples directly in the `for` loop:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 a\n", "1 b\n", "2 c\n" ] } ], "source": [ "for i, j in zip(list_1, list_2):\n", " print(i, j)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`enumerate()` adds a counter to an iterable which we can use within the loop." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 'a')\n", "(1, 'b')\n", "(2, 'c')\n" ] } ], "source": [ "for i in enumerate(list_2):\n", " print(i)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "index 0, value a\n", "index 1, value b\n", "index 2, value c\n" ] } ], "source": [ "for n, i in enumerate(list_2):\n", " print(f\"index {n}, value {i}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can loop through key-value pairs of a dictionary using `.items()`. The general syntax is `for key, value in dictionary.items()`." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DSCI 521, is awesome\n", "DSCI 551, is riveting\n", "DSCI 511, is naptime!\n" ] } ], "source": [ "courses = {521 : \"awesome\",\n", " 551 : \"riveting\",\n", " 511 : \"naptime!\"}\n", "\n", "for course_num, description in courses.items():\n", " print(f\"DSCI {course_num}, is {description}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can even use `enumerate()` to do more complex un-packing:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Item 0: DSCI 521, is awesome\n", "Item 1: DSCI 551, is riveting\n", "Item 2: DSCI 511, is naptime!\n" ] } ], "source": [ "for n, (course_num, description) in enumerate(courses.items()):\n", " print(f\"Item {n}: DSCI {course_num}, is {description}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. `while` loops\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use a [`while` loop](https://docs.python.org/3/reference/compound_stmts.html#while) to excute a block of code several times. But beware! If the conditional expression is always `True`, then you've got an infintite loop! " ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n", "9\n", "8\n", "7\n", "6\n", "5\n", "4\n", "3\n", "2\n", "1\n", "Blast off!\n" ] } ], "source": [ "n = 10\n", "while n > 0:\n", " print(n)\n", " n -= 1\n", "\n", "print(\"Blast off!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's read the `while` statement above as if it were in English. It means, “While `n` is greater than 0, display the value of `n` and then decrement `n` by 1. When you get to 0, display the word Blast off!”\n", "\n", "For some loops, it's hard to tell when, or if, they will stop! Take a look at the [Collatz conjecture](https://en.wikipedia.org/wiki/Collatz_conjecture). The conjecture states that no matter what positive integer `n` we start with, the sequence will always eventually reach 1 - we just don't know how many iterations it will take." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11\n", "34\n", "17\n", "52\n", "26\n", "13\n", "40\n", "20\n", "10\n", "5\n", "16\n", "8\n", "4\n", "2\n", "1\n" ] } ], "source": [ "n = 11\n", "while n != 1:\n", " print(int(n))\n", " if n % 2 == 0: # n is even\n", " n = n / 2\n", " else: # n is odd\n", " n = n * 3 + 1\n", "print(int(n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hence, in some cases, you may want to force a `while` loop to stop based on some criteria, using the `break` keyword." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "123\n", "370\n", "185\n", "556\n", "278\n", "139\n", "418\n", "209\n", "628\n", "314\n", "Ugh, too many iterations!\n" ] } ], "source": [ "n = 123\n", "i = 0\n", "while n != 1:\n", "    print(n)\n", "    if n % 2 == 0:  # n is even\n", "        n = n // 2  # integer division keeps n an int\n", "    else:  # n is odd\n", "        n = n * 3 + 1\n", "    i += 1\n", "    if i == 10:\n", "        print(\"Ugh, too many iterations!\")\n", "        break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `continue` keyword is similar to `break` but won't stop the loop. Instead, it skips the rest of the current iteration and jumps back to the top of the loop, where the condition is checked again." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n", "8\n", "6\n", "4\n", "2\n", "Blast off!\n" ] } ], "source": [ "n = 10\n", "while n > 0:\n", "    if n % 2 != 0:  # n is odd\n", "        n = n - 1\n", "        continue\n", "        break  # this line is never executed because continue restarts the loop from the top\n", "    print(n)\n", "    n = n - 1\n", "\n", "print(\"Blast off!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Comprehensions\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comprehensions allow us to build lists/tuples/sets/dictionaries in one convenient, compact line of code. I use these quite a bit! Below is a standard `for` loop you might use to iterate over an iterable and create a list:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['T', 'i', 'm', 'e', 't', 'o', 'l', 'e', 'a', 'r', 'n', '!']\n" ] } ], "source": [ "subliminal = ['Tom', 'ingests', 'many', 'eggs', 'to', 'outrun', 'large', 'eagles', 'after', 'running', 'near', '!']\n", "first_letters = []\n", "for word in subliminal:\n", " first_letters.append(word[0])\n", "print(first_letters)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List comprehension allows us to do this in one compact line:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "['T', 'i', 'm', 'e', 't', 'o', 'l', 'e', 'a', 'r', 'n', '!']" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters = [word[0] for word in subliminal] # list comprehension\n", "letters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can make things more complicated by doing multiple iteration or conditional iteration:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(0, 0),\n", " (0, 1),\n", " (0, 2),\n", " (0, 3),\n", " (1, 0),\n", " (1, 1),\n", " (1, 2),\n", " (1, 3),\n", " (2, 0),\n", " (2, 1),\n", " (2, 2),\n", " (2, 3)]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[(i, j) for i in range(3) for j in range(4)]" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 2, 4, 6, 8, 10]" ] }, "execution_count": 24, "metadata": 
{}, "output_type": "execute_result" } ], "source": [ "[i for i in range(11) if i % 2 == 0] # condition the iterator, select only even numbers" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, -1, 2, -3, 4, -5, 6, -7, 8, -9, 10]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[-i if i % 2 else i for i in range(11)] # condition the value, -ve odd and +ve even numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also set comprehension:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'e', 'm', 'o'}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "words = ['hello', 'goodbye', 'the', 'antidisestablishmentarianism']\n", "y = {word[-1] for word in words} # set comprehension\n", "y # only has 3 elements because a set contains only unique items and there would have been two e's" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionary comprehension:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "{'hello': 5, 'goodbye': 7, 'the': 3, 'antidisestablishmentarianism': 28}" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "word_lengths = {word:len(word) for word in words} # dictionary comprehension\n", "word_lengths" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tuple comprehension doesn't work as you might expect... We get a \"generator\" instead (more on that later)." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<generator object <genexpr> at 0x15e469c50>\n" ] } ], "source": [ "y = (word[-1] for word in words) # this is NOT a tuple comprehension - more on generators later\n", "print(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. `try` / `except`\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](img/chapter2/bsod.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above: the [Blue Screen of Death](https://en.wikipedia.org/wiki/Blue_Screen_of_Death) at a Nine Inch Nails concert! Source: [cnet.com](https://www.cnet.com/news/nine-inch-nails-depresses-with-a-big-blue-screen-of-death/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If something goes wrong, we don't want our code to crash - we want it to **fail gracefully**. In Python, this can be accomplished using `try`/`except`. Here is a basic example:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "NameError", "evalue": "name 'this_variable_does_not_exist' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mthis_variable_does_not_exist\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Another line\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# code fails before getting to this line\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mNameError\u001b[0m: name 'this_variable_does_not_exist' is not defined" ] } ], "source": [ "this_variable_does_not_exist\n", "print(\"Another line\") # code fails before getting to this line" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You did something bad! 
But I won't raise an error.\n", "Another line\n" ] } ], "source": [ "try:\n", " this_variable_does_not_exist\n", "except:\n", " pass # do nothing\n", " print(\"You did something bad! But I won't raise an error.\") # print something\n", "print(\"Another line\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python tries to execute the code in the `try` block. If an error is encountered, we \"catch\" this in the `except` block (also called `try`/`catch` in other languages). There are many different error types, or **exceptions** - we saw `NameError` above. " ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "ZeroDivisionError", "evalue": "division by zero", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m5\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0;36m0\u001b[0m \u001b[0;31m# ZeroDivisionError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mZeroDivisionError\u001b[0m: division by zero" ] } ], "source": [ "5/0 # ZeroDivisionError" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "IndexError", "evalue": "list index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mmy_list\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mmy_list\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: list index out of range" ] } ], "source": [ "my_list = [1,2,3]\n", "my_list[5] # IndexError" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "TypeError", "evalue": "'tuple' object does not support item assignment", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mmy_tuple\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mmy_tuple\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0\u001b[0m \u001b[0;31m# TypeError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment" ] } ], "source": [ "my_tuple = (1,2,3)\n", "my_tuple[0] = 0 # TypeError" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, so there are apparently a bunch of different errors one could run into. 
With `try`/`except` you can also capture the exception object itself:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You did something bad!\n", "name 'this_variable_does_not_exist' is not defined\n", "<class 'NameError'>\n" ] } ], "source": [ "try:\n", "    this_variable_does_not_exist\n", "except Exception as ex:\n", "    print(\"You did something bad!\")\n", "    print(ex)\n", "    print(type(ex))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the above, we caught the exception and assigned it to the variable `ex` so that we could print it out. This is useful because you can see what the error message would have been, without crashing your program. You can also catch specific exception types. This is typically the recommended way to catch errors: being specific about which exceptions you catch tells you exactly where and why your code failed." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You made a name error!\n" ] } ], "source": [ "try:\n", "    this_variable_does_not_exist # name error\n", "#     (1, 2, 3)[0] = 1 # type error\n", "#     5/0 # ZeroDivisionError\n", "except TypeError:\n", "    print(\"You made a type error!\")\n", "except NameError:\n", "    print(\"You made a name error!\")\n", "except:\n", "    print(\"You made some other sort of error\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The final `except` would trigger if the error is none of the above types, so this sort of has an `if`/`elif`/`else` feel to it. There are also optional `else` and `finally` clauses (which I almost never use); you can read more about them [here](https://docs.python.org/3/tutorial/errors.html)." 
] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The variable does not exist!\n", "I'm printing anyway!\n" ] } ], "source": [ "try:\n", " this_variable_does_not_exist\n", "except:\n", " print(\"The variable does not exist!\")\n", "finally:\n", " print(\"I'm printing anyway!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also write code that raises an exception on purpose, using `raise`:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "def add_one(x): # we'll get to functions in the next section\n", " return x + 1" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "TypeError", "evalue": "can only concatenate str (not \"int\") to str", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"blah\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36madd_one\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# we'll get to functions in the next section\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: can only concatenate str (not \"int\") to str" ] } ], "source": [ "add_one(\"blah\")" ] }, 
{ "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "def add_one(x):\n", "    if not isinstance(x, float) and not isinstance(x, int):\n", "        raise TypeError(f\"Sorry, x must be numeric, you entered a {type(x)}.\")\n", "\n", "    return x + 1" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "TypeError", "evalue": "Sorry, x must be numeric, you entered a <class 'str'>.", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"blah\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36madd_one\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfloat\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"Sorry, x must be numeric, you entered a {type(x)}.\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m 
\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypeError\u001b[0m: Sorry, x must be numeric, you entered a <class 'str'>." ] } ], "source": [ "add_one(\"blah\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is useful when your function is complicated and would otherwise fail in a confusing way, with an obscure error message. You can make the cause of the error much clearer to the _user_ of the function. If you do this, you should ideally describe these exceptions in the function documentation, so a user knows what to expect if they call your function.\n", "\n", "Finally, we can even define our own exception types. We do this by inheriting from the `Exception` class - we'll explore classes and inheritance more in the next chapter!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "class CustomAdditionError(Exception):\n", "    pass" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def add_one(x):\n", "    if not isinstance(x, float) and not isinstance(x, int):\n", "        raise CustomAdditionError(\"Sorry, x must be numeric\")\n", "\n", "    return x + 1" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "CustomAdditionError", "evalue": "Sorry, x must be numeric", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mCustomAdditionError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"blah\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in 
\u001b[0;36madd_one\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0madd_one\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfloat\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mCustomAdditionError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Sorry, x must be numeric\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mCustomAdditionError\u001b[0m: Sorry, x must be numeric" ] } ], "source": [ "add_one(\"blah\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Functions\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A [function](https://docs.python.org/3/tutorial/controlflow.html#defining-functions) is a reusable piece of code that can accept input parameters, also known as \"arguments\". For example, let's define a function called `square` which takes one input parameter `n` and returns the square `n**2`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def square(n):\n", " n_squared = n**2\n", " return n_squared" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "square(2)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10000" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "square(100)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "152399025" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "square(12345)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Functions begin with the `def` keyword, then the function name, arguments in parentheses, and then a colon (`:`). The code executed by the function is defined by indentation. The output or \"return\" value of the function is specified using the `return` keyword." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Side Effects & Local Variables\n", "\n", "When you create a variable inside a function, it is local, which means that it only exists inside the function. 
For example:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def cat_string(str1, str2):\n", " string = str1 + str2\n", " return string" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'My name is Tom'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cat_string('My name is ', 'Tom')" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "NameError", "evalue": "name 'string' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mstring\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'string' is not defined" ] } ], "source": [ "string" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If a function changes the variables passed into it, then it is said to have **side effects**. For example:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def silly_sum(my_list):\n", " my_list.append(0)\n", " return sum(my_list)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l = [1, 2, 3, 4]\n", "out = silly_sum(l)\n", "out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above looks like what we wanted? But wait... it changed our `l` object..." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 0]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If your function has side effects like this, you must mention it in the documentation (which we'll touch on later in this chapter)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Null Return Type\n", "\n", "If you do not specify a return value, the function returns `None` when it terminates:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "def f(x):\n", " x + 1 # no return!\n", " if x == 999:\n", " return\n", "print(f(0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional & Required Arguments\n", "\n", "Sometimes it is convenient to have _default values_ for some arguments in a function. Because they have default values, these arguments are optional, and are hence called \"optional arguments\". 
For example:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def repeat_string(s, n=2):\n", " return s*n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'mdsmds'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "repeat_string(\"mds\", 2)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'mdsmdsmdsmdsmds'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "repeat_string(\"mds\", 5)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'mdsmds'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "repeat_string(\"mds\") # do not specify `n`; it is optional" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ideally, the default value for optional arguments should be carefully chosen. In the function above, the idea of \"repeating\" something makes me think of having 2 copies, so `n=2` feels like a reasonable default.\n", "\n", "You can have any number of required arguments and any number of optional arguments. All the optional arguments must come after the required arguments. The required arguments are mapped by the order they appear. The optional arguments can be specified out of order when using the function." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 4\n" ] } ], "source": [ "def example(a, b, c=\"DEFAULT\", d=\"DEFAULT\"):\n", " print(a, b, c, d)\n", " \n", "example(1, 2, 3, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the defaults for `c` and `d`:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 DEFAULT DEFAULT\n" ] } ], "source": [ "example(1, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying `c` and `d` as **keyword arguments** (i.e. by name):" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 4\n" ] } ], "source": [ "example(1, 2, c=3, d=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying only one of the optional arguments, by keyword:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 DEFAULT\n" ] } ], "source": [ "example(1, 2, c=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying all the arguments as keyword arguments, even though only `c` and `d` are optional:" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 4\n" ] } ], "source": [ "example(a=1, b=2, c=3, d=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying `c` by the fact that it comes 3rd (I do not recommend this because I find it confusing):" ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 DEFAULT\n" ] } ], "source": [ "example(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying the optional 
arguments by keyword, but in the wrong order (this can also be confusing, but not so terrible - I am fine with it):" ] }, { "cell_type": "code", "execution_count": 148, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 3 4\n" ] } ], "source": [ "example(1, 2, d=4, c=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying the non-optional arguments by keyword (I am fine with this):" ] }, { "cell_type": "code", "execution_count": 149, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 DEFAULT DEFAULT\n" ] } ], "source": [ "example(a=1, b=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying the non-optional arguments by keyword, but in the wrong order (not recommended, I find it confusing):" ] }, { "cell_type": "code", "execution_count": 150, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2 DEFAULT DEFAULT\n" ] } ], "source": [ "example(b=2, a=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying keyword arguments before non-keyword arguments (this throws an error):" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "SyntaxError", "evalue": "positional argument follows keyword argument (, line 1)", "output_type": "error", "traceback": [ "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m example(a=2, 1)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m positional argument follows keyword argument\n" ] } ], "source": [ "example(a=2, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multiple Return Values\n", "\n", "In many programming languages, functions can only return one object. That is technically true in Python too, but there is a \"workaround\", which is to return a tuple." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def sum_and_product(x, y):\n", " return (x + y, x * y)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(11, 30)" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum_and_product(5, 6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The parentheses can be omitted (and often are), and a `tuple` is implicitly returned as defined by the use of the comma: " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def sum_and_product(x, y):\n", " return x + y, x * y" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(11, 30)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum_and_product(5, 6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is common to immediately unpack a returned tuple into separate variables, so it really feels like the function is returning multiple values:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "s, p = sum_and_product(5, 6)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "30" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "p" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an aside, it is conventional in Python to use `_` for values you don't want:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "s, _ = sum_and_product(5, 6)" ] }, { "cell_type": 
"code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "30" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functions with Arbitrary Number of Arguments\n", "\n", "You can also call/define functions that accept an arbitrary number of positional or keyword arguments using `*args` and `**kwargs`." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "def add(*args):\n", " print(args)\n", " return sum(args)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2, 3, 4, 5, 6)\n" ] }, { "data": { "text/plain": [ "21" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add(1, 2, 3, 4, 5, 6)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "def add(**kwargs):\n", " print(kwargs)\n", " return sum(kwargs.values())" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'a': 3, 'b': 4, 'c': 5}\n" ] }, { "data": { "text/plain": [ "12" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add(a=3, b=4, c=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Functions as a Data Type\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Python, functions are actually a data type:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "def do_nothing(x):\n", " return x" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "function" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(do_nothing)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "print(do_nothing)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This means you can pass functions as arguments into other functions." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "def square(y):\n", " return y**2\n", "\n", "def evaluate_function_on_x_plus_1(fun, x):\n", " return fun(x+1)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "36" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "evaluate_function_on_x_plus_1(square, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So what happened above?\n", "- `fun(x+1)` becomes `square(5+1)`\n", "- `square(6)` becomes `36`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Anonymous Functions\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are two ways to define functions in Python. The way we've beenusing up until now:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "def add_one(x):\n", " return x+1" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8.2" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add_one(7.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or by using the `lambda` keyword:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "add_one = lambda x: x+1 " ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "function" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(add_one)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8.2" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add_one(7.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The two approaches above are identical. The one with `lambda` is called an **anonymous function**. Anonymous functions can only take up one line of code, so they aren't appropriate in most cases, but can be useful for smaller things." 
] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "36" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "evaluate_function_on_x_plus_1(lambda x: x ** 2, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above:\n", "\n", "- First, `lambda x: x**2` evaluates to a value of type `function` (notice that this function is never given a name - hence \"anonymous functions\").\n", "- Then, the function and the integer `5` are passed into `evaluate_function_on_x_plus_1`.\n", "- At which point the anonymous function is evaluated on `5+1`, and we get `36`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. DRY Principle, Designing Good Functions\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DRY stands for **Don't Repeat Yourself**. See the relevant [Wikipedia article](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) for more about this principle.\n", "\n", "As an example, consider the task of turning each element of a list into a palindrome." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "names = [\"milad\", \"tom\", \"tiffany\"]" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'mot'" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "name = \"tom\"\n", "name[::-1] # creates a slice that starts at the end and moves backwards, syntax is [begin:end:step]" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names_backwards = list()\n", "\n", "names_backwards.append(names[0] + names[0][::-1])\n", "names_backwards.append(names[1] + names[1][::-1])\n", "names_backwards.append(names[2] + names[2][::-1])\n", "names_backwards" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code above is gross, terrible, yucky code for several reasons:\n", "1. It only works for a list with 3 elements;\n", "2. It only works for a list named `names`;\n", "3. If we want to change its functionality, we need to change 3 similar lines of code (Don't Repeat Yourself!!);\n", "4. 
It is hard to understand what it does just by looking at it.\n", "\n", "Let's try this a different way:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names_backwards = list()\n", "\n", "for name in names:\n", " names_backwards.append(name + name[::-1])\n", " \n", "names_backwards" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above is slightly better and we have solved problems (1) and (3). But let's create a function to make our life easier:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def make_palindromes(names):\n", " names_backwards = list()\n", " \n", " for name in names:\n", " names_backwards.append(name + name[::-1])\n", " \n", " return names_backwards\n", "\n", "make_palindromes(names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, this is even better. We have now also solved problem (2), because you can call the function with any list, not just `names`. 
For example, what if we had multiple _lists_:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "names1 = [\"milad\", \"tom\", \"tiffany\"]\n", "names2 = [\"apple\", \"orange\", \"banana\"]" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "make_palindromes(names1)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['appleelppa', 'orangeegnaro', 'bananaananab']" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "make_palindromes(names2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Designing Good Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How far you go and how you choose to apply the DRY principle is up to you and the programming context. These decisions are often ambiguous. Should `make_palindromes()` be a function if I'm only ever doing it once? Twice? Should the loop be inside the function, or outside? Should there be TWO functions, one that loops over the other?\n", "\n", "In my personal opinion, `make_palindromes()` does a bit too much to be understandable. 
I prefer this:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'miladdalim'" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def make_palindrome(name):\n", " return name + name[::-1]\n", "\n", "make_palindrome(\"milad\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From here, if we want to \"apply `make_palindrome` to every element of a list\" we could use a list comprehension:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[make_palindrome(name) for name in names]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also the built-in `map()` function, which does exactly this (it applies a function to every element of an iterable):" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['miladdalim', 'tommot', 'tiffanyynaffit']" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(make_palindrome, names))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Generators\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall list comprehension from earlier in the chapter:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[n for n in range(10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comprehensions evaluate the entire expression at once, and then returns the full data product. Sometimes, we want to work with just one part of our data at a time, for example, when we can't fit all of our data in memory. For this, we can use *generators*." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " at 0x110220650>" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(n for n in range(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we just created a `generator object`. Generator objects are like a \"recipe\" for generating values. They don't actually do any computation until they are asked to. 
We can get values from a generator in three main ways:\n", "- Using `next()`\n", "- Using `list()`\n", "- Looping" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "gen = (n for n in range(10))" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(gen)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(gen)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the generator is exhausted, it will no longer return values:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] }, { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mgen\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mn\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m11\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m 
\u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgen\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "gen = (n for n in range(10))\n", "for i in range(11):\n", " print(next(gen))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see all the values of a generator using `list()` but this defeats the purpose of using a generator in the first place:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gen = (n for n in range(10))\n", "list(gen)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can loop over generator objects too:" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] } ], "source": [ "gen = (n for n in range(10))\n", "for i in gen:\n", " print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above, we saw how to create a generator object using comprehension syntax but with parentheses. 
We can also create a generator using a function with the `yield` keyword (instead of the `return` keyword):" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [], "source": [ "def gen():\n", " for n in range(10):\n", " yield (n, n ** 2)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 0)\n", "(1, 1)\n", "(2, 4)\n" ] } ], "source": [ "g = gen()\n", "print(next(g))\n", "print(next(g))\n", "print(next(g))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below is a real-world motivating case where a generator might be useful. Say we want to create a list of dictionaries containing information about houses in Canada." ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "import random # we'll learn about imports in a later chapter\n", "import time\n", "import memory_profiler\n", "city = ['Vancouver', 'Toronto', 'Ottawa', 'Montreal']" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "def house_list(n):\n", " houses = []\n", " for i in range(n):\n", " house = {\n", " 'id': i,\n", " 'city': random.choice(city),\n", " 'bedrooms': random.randint(1, 5),\n", " 'bathrooms': random.randint(1, 3),\n", " 'price ($1000s)': random.randint(300, 1000)\n", " }\n", " houses.append(house)\n", " return houses" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'id': 0,\n", " 'city': 'Ottawa',\n", " 'bedrooms': 5,\n", " 'bathrooms': 2,\n", " 'price ($1000s)': 420},\n", " {'id': 1,\n", " 'city': 'Montreal',\n", " 'bedrooms': 5,\n", " 'bathrooms': 1,\n", " 'price ($1000s)': 652}]" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "house_list(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What happens if we want to create a list of 500,000 houses? 
How much time/memory will it take?" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Memory usage before: 86 mb\n", "Memory usage after: 251 mb\n", "Time taken: 2.24s\n" ] } ], "source": [ "start = time.time()\n", "mem = memory_profiler.memory_usage()\n", "print(f\"Memory usage before: {mem[0]:.0f} mb\")\n", "people = house_list(500000)\n", "print(f\"Memory usage after: {memory_profiler.memory_usage()[0]:.0f} mb\")\n", "print(f\"Time taken: {time.time() - start:.2f}s\")" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [], "source": [ "def house_generator(n):\n", " for i in range(n):\n", " house = {\n", " 'id': i,\n", " 'city': random.choice(city),\n", " 'bedrooms': random.randint(1, 5),\n", " 'bathrooms': random.randint(1, 3),\n", " 'price ($1000s)': random.randint(300, 1000)\n", " }\n", " yield house" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Memory usage before: 86 mb\n", "Memory usage after: 89 mb\n", "Time taken: 0.17s\n" ] } ], "source": [ "start = time.time()\n", "print(f\"Memory usage before: {mem[0]:.0f} mb\")\n", "people = house_generator(500000)\n", "print(f\"Memory usage after: {memory_profiler.memory_usage()[0]:.0f} mb\")\n", "print(f\"Time taken: {time.time() - start:.2f}s\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although, if we used `list()` to extract all of the generator values, we'd lose our memory savings:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Memory usage before: 36 mb\n", "Memory usage after: 202 mb\n" ] } ], "source": [ "print(f\"Memory usage before: {mem[0]:.0f} mb\")\n", "people = list(house_generator(500000))\n", "print(f\"Memory usage after: {memory_profiler.memory_usage()[0]:.0f} mb\")" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## 10. Docstrings\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One problem we never really solved when talking about writing good functions was: \"4. It is hard to understand what it does just by looking at it\". This brings up the idea of function documentation, called \"docstrings\". The [docstring](https://www.python.org/dev/peps/pep-0257/) goes right after the `def` line and is wrapped in **triple quotes** `\"\"\"`." ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "def make_palindrome(string):\n", " \"\"\"Turns the string into a palindrome by concatenating itself with a reversed version of itself.\"\"\"\n", " \n", " return string + string[::-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Python we can use the `help()` function to view another function's documentation. In IPython/Jupyter, we can use `?` to view the documentation string of any function in our environment." ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mmake_palindrome\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mstring\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m Turns the string into a palindrome by concatenating itself with a reversed version of itself.\n", "\u001b[0;31mFile:\u001b[0m ~/GitHub/online-courses/python-programming-for-data-science/chapters/\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "make_palindrome?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But, even easier than that, if your cursor is in the function parentheses, you can use the shortcut `shift + tab` to open the docstring at will." 
] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [], "source": [ "# make_palindrome('uncomment this line and try pressing shift+tab here.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Docstring Structure\n", "\n", "General docstring convention in Python is described in [PEP 257 - Docstring Conventions](https://www.python.org/dev/peps/pep-0257/). There are many different docstring style conventions used in Python. The exact style you use can be important for helping you to render your documentation, or for helping your IDE parse your documentation. Common styles include:\n", "\n", "1. **Single-line**: If it's short, then just a single line describing the function will do (as above).\n", "2. **reST style**: see [here](https://www.python.org/dev/peps/pep-0287/).\n", "3. **NumPy style**: see [here](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html). (RECOMMENDED! and MDS-preferred)\n", "4. **Google style**: see [here](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html#example-google).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The NumPy style:" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [], "source": [ "def function_name(param1, param2, param3):\n", "    \"\"\"First line is a short description of the function.\n", "    \n", "    A paragraph describing in a bit more detail what the\n", "    function does and what algorithms it uses and common\n", "    use cases.\n", "    \n", "    Parameters\n", "    ----------\n", "    param1 : datatype\n", "        A description of param1.\n", "    param2 : datatype\n", "        A description of param2.\n", "    param3 : datatype\n", "        A longer description because maybe this requires\n", "        more explanation and we can use several lines.\n", "    \n", "    Returns\n", "    -------\n", "    datatype\n", "        A description of the output, datatypes and behaviours.\n", "        Describe special cases and anything the user needs to\n", "        know to use the function.\n", "    \n", "    Examples\n", "    --------\n", "    >>> function_name(3,8,-5)\n", "    2.0\n", "    \"\"\"" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def make_palindrome(string):\n", "    \"\"\"Turns the string into a palindrome by concatenating \n", "    itself with a reversed version of itself.\n", "    \n", "    Parameters\n", "    ----------\n", "    string : str\n", "        The string to turn into a palindrome.\n", "    \n", "    Returns\n", "    -------\n", "    str\n", "        string concatenated with a reversed version of string\n", "    \n", "    Examples\n", "    --------\n", "    >>> make_palindrome('tom')\n", "    'tommot'\n", "    \"\"\"\n", "    return string + string[::-1]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mmake_palindrome\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mstring\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Turns the string into a palindrome by concatenating \n", "itself with a reversed version of itself.\n", "\n", "Parameters\n", "----------\n", "string : str\n", "    The string to turn into a palindrome.\n", " \n", "Returns\n", "-------\n", "str\n", "    string concatenated with a reversed version of string\n", " \n", "Examples\n", "--------\n", ">>> make_palindrome('tom')\n", "'tommot'\n", "\u001b[0;31mFile:\u001b[0m ~/GitHub/online-courses/python-programming-for-data-science/chapters/\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "make_palindrome?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Docstrings with Optional Arguments\n", "\n", "When documenting function arguments, we also state the default values of optional arguments:" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [], "source": [ "# scipy style\n", "def repeat_string(s, n=2):\n", "    \"\"\"\n", "    Repeat the string s, n times.\n", "    \n", "    Parameters\n", "    ----------\n", "    s : str \n", "        the string\n", "    n : int, optional\n", "        the number of times, by default = 2\n", "    \n", "    Returns\n", "    -------\n", "    str\n", "        the repeated string\n", "    \n", "    Examples\n", "    --------\n", "    >>> repeat_string(\"Blah\", 3)\n", "    'BlahBlahBlah'\n", "    \"\"\"\n", "    return s * n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Type Hints" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Type hinting](https://docs.python.org/3/library/typing.html) is exactly what it sounds like: it hints at the data types of a function's arguments and return value. You can indicate the type of an argument using the syntax `argument: dtype`, and the type of the return value using `def func() -> dtype`. 
Let's see an example:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [], "source": [ "# NumPy style\n", "def repeat_string(s: str, n: int = 2) -> str:  # <---- note the type hinting here\n", "    \"\"\"\n", "    Repeat the string s, n times.\n", "    \n", "    Parameters\n", "    ----------\n", "    s : str \n", "        the string\n", "    n : int, optional (default = 2)\n", "        the number of times\n", "    \n", "    Returns\n", "    -------\n", "    str\n", "        the repeated string\n", "    \n", "    Examples\n", "    --------\n", "    >>> repeat_string(\"Blah\", 3)\n", "    'BlahBlahBlah'\n", "    \"\"\"\n", "    return s * n" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mrepeat_string\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ms\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mint\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Repeat the string s, n times.\n", "\n", "Parameters\n", "----------\n", "s : str \n", "    the string\n", "n : int, optional (default = 2)\n", "    the number of times\n", " \n", "Returns\n", "-------\n", "str\n", "    the repeated string\n", " \n", "Examples\n", "--------\n", ">>> repeat_string(\"Blah\", 3)\n", "'BlahBlahBlah'\n", "\u001b[0;31mFile:\u001b[0m ~/GitHub/online-courses/python-programming-for-data-science/chapters/\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "repeat_string?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Type hinting just helps your users and your IDE identify data types and catch bugs. It's just another level of documentation. 
Type hints do not force users to use the hinted data type; for example, I can still pass a `dict` to `repeat_string` if I want to:" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for *: 'dict' and 'int'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mrepeat_string\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m{\u001b[0m\u001b[0;34m'key_1'\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'key_2'\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36mrepeat_string\u001b[0;34m(s, n)\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[0;34m\"BlahBlahBlah\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 22\u001b[0m \"\"\"\n\u001b[0;32m---> 23\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0ms\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for *: 'dict' and 'int'" ] } ], "source": [ "repeat_string({'key_1': 1, 'key_2': 2})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most IDEs are clever enough to read your type hints and warn you if you pass a different data type to the function, e.g., this VS Code screenshot:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](img/chapter2/type_hint_1.png)\n", "![](img/chapter2/type_hint_2.png)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, 
"language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Lecture Outline", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "333.188px" }, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }