{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "XD4cDJcke_7v" }, "source": [ "# Introduction to NumPy\n", "\n", "* [Notebook](https://colab.research.google.com/drive/1qz1oLQxXZf44rjkKx3Xevngx1MEKV0yY) of this section (open in a separate tab)\n", "\n", "* [Video](https://youtu.be/z98FrHFfXaI) of this section (18 minutes)\n", "\n", "NumPy is a numerical computation library for Python." ] }, { "cell_type": "markdown", "source": [ "## Overview" ], "metadata": { "id": "kSff_moUGy1c" } }, { "cell_type": "markdown", "source": [ "**NumPy**\n", "- Python provides many useful modules (libraries).\n", "- NumPy is a module for numerical computation.\n", "- You can load NumPy with `import numpy`.\n", "- It is common to import it as `import numpy as np`, so that you can write `np.array()` instead of `numpy.array()`, for example.\n", "- You can convert a list to a NumPy array with `numpy.array(list)`.\n", "- NumPy arrays represent vectors, matrices, and tensors.\n", "- Note that a list and a NumPy array look similar but are different." ], "metadata": { "id": "1JWWoDRFD32H" } }, { "cell_type": "markdown", "source": [ "**matplotlib.pyplot**\n", "- Library for drawing graphs, etc. in Python.\n", "- It is common to load it with `import matplotlib.pyplot as plt` and refer to it as `plt`.\n" ], "metadata": { "id": "k8CEF6NiD38Y" } }, { "cell_type": "markdown", "source": [ "Other Frequently Used Libraries\n", "\n", "**Pandas**\n", "\n", "- Pandas is a library for working with two-dimensional tables, like those in Excel.\n", "- It is common to load it with `import pandas as pd` and refer to it as `pd`.\n", "- `pd.read_csv()` is used to read CSV files in this course.\n", "\n", "**OpenCV (cv2)**\n", "- Library with rich functionality for handling images\n", "\n", "**scikit-learn**\n", "- Module for machine learning. 
It offers a wide range of methods other than neural networks.\n", "\n" ], "metadata": { "id": "-obcXOxrD35H" } }, { "cell_type": "markdown", "metadata": { "id": "CFlQ6Wv7hROd" }, "source": [ "## Creating NumPy arrays\n", "\n", "Use `np.array(list)` to create an array from a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_b-pHtO_aLDm" }, "outputs": [], "source": [ "# Import NumPy and name it np.\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "k-saf3S4i-p9", "outputId": "889df197-6652-4d50-c00b-f0724a3284b7" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'numpy.ndarray'>\n" ] }, { "data": { "text/plain": [ "array([1, 2, 3])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.array([1,2,3])\n", "print(type(x))\n", "x" ] }, { "cell_type": "markdown", "metadata": { "id": "wSq0KoAZzWaS" }, "source": [ "`array.shape` gives the shape of the array." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "oaVTqCxyzVnt", "outputId": "26ddbf50-b74e-4264-9f52-f487b1b071c5" }, "outputs": [ { "data": { "text/plain": [ "(3,)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.shape" ] }, { "cell_type": "markdown", "metadata": { "id": "qg2PYQhGj01C" }, "source": [ "Here, `(3,)` indicates that it is a vector with three elements (a one-dimensional array of length 3)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "VBRtfXJL3am2", "outputId": "6f254302-bb2f-4833-c7c0-33d05ff9e7d7" }, "outputs": [ { "data": { "text/plain": [ "array([0., 0., 0.])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.zeros(3,) # np.zeros(shape) creates an array of the given shape filled with zeros" ] }, { "cell_type": "markdown", "metadata": { "id": "ce91xfUX3beg" }, "source": [ "The matrix $A = \\begin{bmatrix}1&2&3&4\\\\2&3&4&5\\\\3&4&5&6\\end{bmatrix}$ can be written as follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "kUJ5qTpPjAXE", "outputId": "629e7a46-e0d5-4c10-aff5-e1667351ad60" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 4)\n" ] }, { "data": { "text/plain": [ "array([[1, 2, 3, 4],\n", " [2, 3, 4, 5],\n", " [3, 4, 5, 6]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "aa = np.array([[1,2,3,4],\n", " [2,3,4,5],\n", " [3,4,5,6]])\n", "print(aa.shape)\n", "aa" ] }, { "cell_type": "markdown", "metadata": { "id": "gHhNjYFU0DgN" }, "source": [ "Elements are accessed in the same way as for a list.\n", "The following two lines refer to the same element." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "R_ktanJ-z1dI", "outputId": "d036d9e7-c40f-4d58-fba0-d5e965319bd9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n" ] }, { "data": { "text/plain": [ "2" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(aa[0][1])\n", "aa[0,1]" ] }, { "cell_type": "markdown", "metadata": { "id": "uBAHlBDO2dyO" }, "source": [ "Unlike nested lists, arrays must be rectangular (every row must have the same length)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "LSJafq9ZcQ8R", "outputId": "fbd2a419-2d0a-4c29-93e6-fdcc1420653f" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.\n", " \"\"\"Entry point for launching an IPython kernel.\n" ] } ], "source": [ "a = np.array([[1,2],[3]])" ] }, { "cell_type": "markdown", "metadata": { "id": "aOlkSikSvD75" }, "source": [ "## Computing with NumPy arrays\n", "\n", "Basically, it works the same way as matrix computation. For example,\n", "$$\\begin{bmatrix}1&2&3\\\\4&5&6\\end{bmatrix}\n", "+\\begin{bmatrix}1&1&1\\\\2&2&2\\end{bmatrix}\n", "\\quad \\left(=\n", "\\begin{bmatrix}2&3&4\\\\6&7&8\\end{bmatrix}\\right)$$\n", "can be written as follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "PIvXK9kokO2Z", "outputId": "85ce4711-3b58-4830-8757-b1a899242c7a" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[2, 3, 4],\n", " [6, 7, 8]])" ] }, "metadata": {}, "execution_count": 3 } ], "source": [ "x = np.array([[1,2,3],\n", " [4,5,6]])\n", "y = np.array([[1,1,1],\n", " [2,2,2]])\n", "\n", "x + y" ] }, { "cell_type": "markdown", "metadata": { "id": "-XhUV4rzDRNo" }, "source": [ "`x * y` multiplies the arrays element by element." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "6QA5AdXdDRqn", "outputId": "aecda757-98eb-4834-b2c0-3a5f28dd6109" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 1, 2, 3],\n", " [ 8, 10, 12]])" ] }, "metadata": {}, "execution_count": 4 } ], "source": [ "x * y" ] }, { "cell_type": "markdown", "metadata": { "id": "ewhTiLOQv8jm" }, "source": [ "Scalar multiplication is performed using `*`, the same as for ordinary multiplication." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "aH_l03XSvura", "outputId": "43078dea-f464-453d-a61d-947d08c1440b" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 2, 4, 6],\n", " [ 8, 10, 12]])" ] }, "metadata": {}, "execution_count": 5 } ], "source": [ "2 * x" ] }, { "cell_type": "markdown", "metadata": { "id": "4LFlDHGY84xe" }, "source": [ "Matrix product" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "W4Su3Fhf85Bs", "outputId": "37936d1f-5082-4630-94d3-212dddd6d155" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 6, 12],\n", " [15, 30]])" ] }, "metadata": {}, "execution_count": 6 } ], "source": [ "np.dot(x, y.T) # y.T is the transpose of y" ] }, { "cell_type": "markdown", "metadata": { "id": "hd3lTwfUwxVc" }, "source": [ "**Unlike what you normally learn in linear algebra**, you can also add a scalar to an array.\n", "The scalar is added to all the components.\n", "$$\\begin{bmatrix}1&2&3\\\\4&5&6\\end{bmatrix} + 2\n", "= \\begin{bmatrix}3&4&5\\\\6&7&8\\end{bmatrix}\n", "$$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "PwBCfPzNv6hd", "outputId": "fe56c247-8946-4700-eac7-21dbaa4cb35c" }, "outputs": [ { "output_type": 
"execute_result", "data": { "text/plain": [ "array([[3, 4, 5],\n", " [6, 7, 8]])" ] }, "metadata": {}, "execution_count": 7 } ], "source": [ "x + 2" ] }, { "cell_type": "markdown", "metadata": { "id": "8rEA2UjYw658" }, "source": [ "The feature of applying a function to all elements in this way is called **Broadcast**.\n", "\n", "Broadcast can be used with many NumPy functions in addition to adding scalars. For example, `np.sin()` applies the sine function to all components." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "oh_ewQGkw5xg", "outputId": "be390f62-ad14-447c-f0ef-2ed16b0afb6c" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 0.84147098, 0.90929743, 0.14112001],\n", " [-0.7568025 , -0.95892427, -0.2794155 ]])" ] }, "metadata": {}, "execution_count": 8 } ], "source": [ "np.sin(x)" ] }, { "cell_type": "markdown", "metadata": { "id": "GQtlncO8DVHM" }, "source": [ "Also, > and == return a boolean value (True = 1, False = 0) for each element, which can also be broadcasted." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "nB3yo3ptDVzS", "outputId": "b70b7bcf-224b-4868-d7f6-964e940e9c0b" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[[False False True]\n", " [ True True True]]\n", "[[False False True]\n", " [False False False]]\n" ] } ], "source": [ "print(x>2)\n", "print(x==3)" ] }, { "cell_type": "markdown", "metadata": { "id": "37jcwiEQ9LSp" }, "source": [ "## Statistics" ] }, { "cell_type": "markdown", "metadata": { "id": "Rq1G4rikCofA" }, "source": [ "Maximum value" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "2dUQcEs59RDq", "outputId": "138c9d52-c07a-40e9-9532-8765a2db7685" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[[1 2 3]\n", " [4 5 6]]\n", "6\n", "6\n" ] } ], "source": [ "print(x)\n", "\n", "# The following two lines both return the maximum value of the array\n", "print(np.max(x))\n", "print(x.max())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "IHAIxF1H9qy-", "outputId": "1870287e-7951-47e5-edd4-2d6007ce3d8c" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[4 5 6]\n", "[3 6]\n" ] } ], "source": [ "# Maximum along axis 0 (one value per column)\n", "print(np.max(x, axis =0))\n", "\n", "# Maximum along axis 1 (one value per row)\n", "print(x.max(axis =1))" ] }, { "cell_type": "markdown", "metadata": { "id": "jupOcqJ0-_Rm" }, "source": [ "Index of the maximum value" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "LfJ24Y_C_DwP", "outputId": "8c8165a8-da48-40d9-cbb6-5254f16d2f0e" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5\n" ] }, { "data": { "text/plain": [ "array([1, 1, 1])" ] }, "execution_count": 16, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "print(x.argmax()) # Returns the index of the entire as a first order array.\n", "x.argmax(axis = 0)" ] }, { "cell_type": "markdown", "metadata": { "id": "zYS87z3G-JXh" }, "source": [ "sum" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "sp30I7TX-JXs", "outputId": "607c74f3-8760-449a-b75a-bc21b12d22a5" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "21\n", "21\n" ] } ], "source": [ "# The following two lines both return the sum of the array components\n", "print(np.sum(x))\n", "print(x.sum())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "6sKI7pQt-JX3", "outputId": "122d003d-4f0d-44ac-9fad-46caab061f08" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[5 7 9]\n", "[ 6 15]\n" ] } ], "source": [ "# 第0インデックスに関する和\n", "print(np.sum(x, axis =0))\n", "\n", "#第1インデックスに関する和\n", "print(x.sum(axis =1))" ] }, { "cell_type": "markdown", "metadata": { "id": "8RVQCM5K-gY4" }, "source": [ "Mean, variance, and standard deviation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ExkggfxP-lrO", "outputId": "e2d7a526-b244-47b0-c911-c038b5bf47ef" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2.5 3.5 4.5]\n", "[2.25 2.25 2.25]\n", "[1.5 1.5 1.5]\n" ] } ], "source": [ "print( x.mean(axis = 0) )\n", "print( x.var(axis = 0) )\n", "print( x.std(axis = 0) )" ] }, { "cell_type": "markdown", "metadata": { "id": "EJtmM2ujHIY7" }, "source": [ "## Matplotlib\n", "\n", "Matplotlib is a Python library for drawing, and pyplot is a module (i.e., a subset of the library) of Matplotlib for drawing graphs.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 684 }, 
"id": "khWGY7kU_AaH", "outputId": "f14b3432-876a-43b9-f636-318e326138f8" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "x = [0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 1.1 1.2 1.3 1.4 1.5 1.6 1.7\n", " 1.8 1.9 2. 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3. 3.1 3.2 3.3 3.4 3.5\n", " 3.6 3.7 3.8 3.9 4. 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5. 5.1 5.2 5.3\n", " 5.4 5.5 5.6 5.7 5.8 5.9]\n", "y = [ 0. 0.09983342 0.19866933 0.29552021 0.38941834 0.47942554\n", " 0.56464247 0.64421769 0.71735609 0.78332691 0.84147098 0.89120736\n", " 0.93203909 0.96355819 0.98544973 0.99749499 0.9995736 0.99166481\n", " 0.97384763 0.94630009 0.90929743 0.86320937 0.8084964 0.74570521\n", " 0.67546318 0.59847214 0.51550137 0.42737988 0.33498815 0.23924933\n", " 0.14112001 0.04158066 -0.05837414 -0.15774569 -0.2555411 -0.35078323\n", " -0.44252044 -0.52983614 -0.61185789 -0.68776616 -0.7568025 -0.81827711\n", " -0.87157577 -0.91616594 -0.95160207 -0.97753012 -0.993691 -0.99992326\n", " -0.99616461 -0.98245261 -0.95892427 -0.92581468 -0.88345466 -0.83226744\n", " -0.77276449 -0.70554033 -0.63126664 -0.55068554 -0.46460218 -0.37387666]\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "\n" }, "metadata": {} } ], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "x = np.arange(0, 6, 0.1) # Generate array of 0.1 increments from 0 to 6\n", "y = np.sin(x) # Broadcast sin() to all components of array x\n", "\n", "print('x = ',x)\n", "print('y = ',y)\n", "\n", "plt.plot(x,y)\n", "\n", "plt.show()# instruction to display (works without, but also displays extra information)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 430 }, "id": "DzoZoQalItua", "outputId": "ac73aa53-ca5d-474f-cb9e-7ce6d17566e6" }, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "\n" }, "metadata": {} } ], "source": [ "y1 = np.sin(x)\n", "y2 = np.cos(x)\n", "\n", "plt.plot(x,y1, label='sin')\n", "plt.plot(x,y2, label='cos', linestyle= '--')\n", "plt.legend() # Display descriptions for graphs\n", "\n", "plt.show()" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 0 }