NumPy array basics A
Lists in Python are quite general and can hold arbitrary objects as elements. The + and * operators are defined for lists, but they concatenate and repeat rather than do arithmetic. So lists won't give us what we want for numerical computations, as the following examples show:
Multiplication - repeats:
>>> a = [1, 2]
>>> 2*a
[1, 2, 1, 2]
Addition - concatenates:
>>> a = [1, 2]
>>> b = [3, 4]
>>> a + b
[1, 2, 3, 4]
If we do the same operations with NumPy, we get:
>>> import numpy as np
>>> a = np.array([1, 2])
>>> 2*a
array([2, 4])
>>>
>>> b = np.array([3, 4])
>>> a + b
array([4, 6])
Also, note here that the '*' does component-wise multiplication:
>>> x = np.array(range(5))
>>> x
array([0, 1, 2, 3, 4])
>>> np.sqrt(x) * x + np.cos(x)
array([ 1.        ,  1.54030231,  2.41228029,  4.20615993,  7.34635638])
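As a rough cross-check, the same expression can be written as a plain Python loop with the standard math module. This is only a sketch; the names vectorized and looped are illustrative and not part of the original example.

```python
import math
import numpy as np

x = np.arange(5)

# Vectorized: each operation is applied element by element.
vectorized = np.sqrt(x) * x + np.cos(x)

# The same computation written as a pure-Python loop over 0..4.
looped = [math.sqrt(v) * v + math.cos(v) for v in range(5)]

# Both approaches should agree up to floating-point rounding.
print(np.allclose(vectorized, looped))
```

The NumPy version needs no explicit loop, which is the main reason arrays are preferred over lists for numerical work.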
One more thing about data types before we dive into NumPy itself. Unlike lists, all elements of an np.array have the same type:
>>> np.array([1., 2., 3.])  # all floats
array([ 1.,  2.,  3.])
>>> np.array([1. , 2, 3])  # one float
array([ 1.,  2.,  3.])  # all elements become float
We can also state the data type explicitly with the dtype argument:
>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j,  2.+0.j,  3.+0.j])
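The element type of an array can be inspected through its dtype attribute. The sketch below is illustrative: the variable names are made up for this example, and the exact integer width of the first array depends on the platform, so it is not printed as an expected value.

```python
import numpy as np

ints = np.array([1, 2, 3])        # integer dtype (exact width is platform-dependent)
mixed = np.array([1.0, 2, 3])     # a single float promotes every element to float
as_complex = np.array([1, 2, 3], dtype=complex)

print(ints.dtype, mixed.dtype, as_complex.dtype)

# astype returns a converted copy; the original array keeps its dtype.
as_float = ints.astype(float)
print(as_float.dtype, ints.dtype)
```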
NumPy is a Python extension to add support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions.
>>> import numpy as np
>>> x = np.array([1,2,3])
>>> x
array([1, 2, 3])
We can also create a two-dimensional array from a nested list:
>>> use_me = [[1,2,3],[4,5,6]]
>>> myArray = np.array(use_me)
>>> myArray
array([[1, 2, 3],
       [4, 5, 6]])
We can fill all elements with zeros.
>>> import numpy as np
>>> zeroArray = np.zeros((2,4))
>>> zeroArray
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
Or ones:
>>> onesArray = np.ones((4,2))
>>> onesArray
array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])
np.empty(...) allocates an array without initializing it, so it holds whatever junk values happen to be in memory:
>>> import numpy as np
>>> emptyArray = np.empty((2,3))
>>> emptyArray
array([[  0.00000000e+000,   3.39519327e-313,   0.00000000e+000],
       [  4.94065646e-324,   1.83322544e-316,   6.94110822e-310]])
The values may look random, but they are not. So, if we need real random numbers, we should not use empty(...).
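A typical use for empty(...) is as a buffer whose elements will all be overwritten before they are read. The following is a minimal sketch of that pattern; the buffer name and the fill values are arbitrary choices for illustration.

```python
import numpy as np

# Uninitialized buffer: its contents are unspecified until we assign to it.
buf = np.empty((2, 3))

# Overwrite every element before reading any of them.
for i in range(buf.shape[0]):
    buf[i, :] = i + 1

print(buf)
```

If the array should start from known values, np.zeros or np.ones is the safer choice.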
numpy.arange([start], stop[, step], dtype=None)
To specify the step as a positional argument, we must specify the start as well. The stop value itself is not included in the result:
>>> a = np.arange(5, 10, 0.5)
>>> a
array([ 5. ,  5.5,  6. ,  6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5])
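The three positional forms of arange can be compared side by side. This is a small sketch with illustrative variable names.

```python
import numpy as np

only_stop = np.arange(5)            # start defaults to 0, step defaults to 1
start_stop = np.arange(2, 5)        # step defaults to 1
with_step = np.arange(5, 10, 0.5)   # all three given positionally

print(only_stop)
print(start_stop)
print(len(with_step), with_step[-1])  # the stop value 10 is excluded
```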
To fill an array with random numbers, we can use np.random.random(...):
>>> import numpy as np
>>> randomArray = np.random.random((4,4))
>>> randomArray
array([[ 0.10085918,  0.44759528,  0.40433292,  0.30975764],
       [ 0.2023531 ,  0.88821789,  0.71853805,  0.64503574],
       [ 0.36394454,  0.01794277,  0.09041095,  0.74117827],
       [ 0.41225956,  0.20244151,  0.59867229,  0.80260473]])
np.random.random(...) uses a random number generator to fill each spot in the array with a number sampled uniformly from the half-open interval [0, 1).
For random integers, np.random.randint(...) takes low and high, as in the example below (low = 1, high = 10). Note that low is inclusive and high is exclusive, so the values range from 1 to 9:
>>> a = np.random.randint(1, 10, (5,2))
>>> a
array([[3, 2],
       [8, 4],
       [5, 2],
       [3, 2],
       [4, 4]])
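Because these values come from a pseudo-random generator, seeding it makes a run repeatable. The sketch below uses np.random.seed with an arbitrary seed value of 0, chosen only for illustration, and checks the value ranges of both functions.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, chosen only for this illustration
floats = np.random.random((4, 4))
ints = np.random.randint(1, 10, (5, 2))

# Re-seeding with the same value replays the same sequence.
np.random.seed(0)
floats_again = np.random.random((4, 4))

print(np.array_equal(floats, floats_again))
print(floats.min() >= 0.0, floats.max() < 1.0)   # floats lie in [0, 1)
print(ints.min() >= 1, ints.max() <= 9)          # high=10 is exclusive
```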
>>> import numpy as np
>>> rArray = np.arange(0,20).reshape((5,4))
>>> rArray
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
- Note that the arange(...) function returns a 1D array similar to what we'd get from using the built-in python function range(...) with the same arguments.
>>> np.arange(0,20)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
- The reshape method takes the data in an existing array, and puts it into an array with the given shape and returns it.
>>> rArray.reshape((2,10))
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
- Note that the original rArray keeps its shape; reshape(...) returns a new array object instead of changing rArray in place:
>>> rArray
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
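One caveat worth sketching: although the shape of rArray is untouched, the reshaped array usually shares its data with the original when the array is contiguous in memory, as it is here. The name wide is illustrative; np.shares_memory is used to check the sharing.

```python
import numpy as np

rArray = np.arange(0, 20).reshape((5, 4))
wide = rArray.reshape((2, 10))

print(rArray.shape)   # the original shape is unchanged
print(wide.shape)

# For a contiguous array like this one, reshape returns a view on the same data.
print(np.shares_memory(rArray, wide))

# Writing through the view is therefore visible in the original.
wide[0, 0] = 99
print(rArray[0, 0])
```

When an independent array is needed, calling .copy() on the reshaped result avoids this sharing.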
- When we use reshape(...), the total number of elements in the array must remain the same. So, reshaping our array with 5 rows and 4 columns (20 elements) into one with 10 rows and 2 columns is fine, but 5x5 or 7x3 would fail (the exact wording of the error message depends on the NumPy version):
>>> rArray.reshape((5,5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
The shape attribute of a NumPy array holds the dimensions of the array as a tuple. If Arr has m rows and n columns, then Arr.shape is (m, n). So Arr.shape[0] is m and Arr.shape[1] is n. Also, Arr.shape[-1] is n and Arr.shape[-2] is m.
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.shape
(10,)
>>> b = a.reshape(5,-1)
>>> b
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
>>> b.shape
(5, 2)
>>> b.shape[0]
5
>>> b.shape[1]
2
>>> b.shape[-1]
2
>>> b.shape[-2]
5
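The -1 passed to reshape above asks NumPy to infer that dimension from the total number of elements. The sketch below shows the inference, tuple unpacking of shape, and the ValueError raised when the size does not divide evenly. Variable names are illustrative.

```python
import numpy as np

a = np.arange(10)

b = a.reshape(5, -1)    # -1 is inferred as 10 / 5 = 2
rows, cols = b.shape    # unpacking works because b is two-dimensional
print(rows, cols)
print(b.ndim, b.size)   # number of dimensions and total element count

c = a.reshape(-1, 5)    # here the row count is inferred instead
print(c.shape)

# 10 elements cannot be split into 3 equal rows, so this raises ValueError.
try:
    a.reshape(3, -1)
except ValueError as err:
    print("reshape failed:", err)
```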
Accessing an array is pretty straightforward. We access a specific location in the table by giving its row and column inside square brackets.
>>> rArray = np.arange(0,20).reshape((5,4))
>>> rArray
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
To get an element, we specify rArray[row, column]:
>>> rArray[2,3]
11
>>> rArray[4,0]
16
Note that the index starts from 0.
We can also refer to ranges inside an array:
>>> rArray[3,1:3]
array([13, 14])
>>> rArray[2:5,1:4]
array([[ 9, 10, 11],
       [13, 14, 15],
       [17, 18, 19]])
These ranges work just like slices for lists. s:e:step specifies a range that starts at s and stops before e, in steps of size step. If any of these is left off, it defaults to the start of the axis, the end of the axis, and 1, respectively.
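The slice defaults can be checked by writing them out explicitly. This is a small sketch; n_rows and n_cols are illustrative names taken from the array's shape.

```python
import numpy as np

rArray = np.arange(0, 20).reshape((5, 4))
n_rows, n_cols = rArray.shape

# Fully spelled-out slices versus slices that rely on the defaults.
explicit = rArray[0:n_rows:1, 0:n_cols:1]
implicit = rArray[:, :]
print(np.array_equal(explicit, implicit))

# Omitting only some parts: rows from 2 to the end, columns up to (not including) 3.
print(np.array_equal(rArray[2:, :3], rArray[2:n_rows:1, 0:3:1]))
```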
If we want only the elements in the first column, we do this:
>>> rArray[:,0:5:4]
array([[ 0],
       [ 4],
       [ 8],
       [12],
       [16]])
>>> rArray[:,0]
array([ 0,  4,  8, 12, 16])
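The two results above hold the same values but differ in shape: a slice keeps the column axis, while an integer index drops it. A minimal sketch of the difference, with illustrative names:

```python
import numpy as np

rArray = np.arange(0, 20).reshape((5, 4))

col_1d = rArray[:, 0]     # integer index: the column axis is dropped
col_2d = rArray[:, 0:1]   # slice: the column axis is kept with length 1

print(col_1d.shape)
print(col_2d.shape)

# Same values either way; only the number of dimensions differs.
print(np.array_equal(col_2d[:, 0], col_1d))
```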
If we want only the 0th, 2nd, 4th rows:
>>> rArray[0:5:2,:]
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11],
       [16, 17, 18, 19]])
Or we can leave off the start and stop and rely on the defaults:
>>> rArray[::2,]
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11],
       [16, 17, 18, 19]])
We can also use negative indices, which count from the end.
To get the last column only:
>>> rArray
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
>>> rArray[:,-1]
array([ 3,  7, 11, 15, 19])
For the 2nd from the last:
>>> rArray[:,-2]
array([ 2,  6, 10, 14, 18])
np.newaxis can be used to add a dimension of length 1 to an np.ndarray:
>>> a = np.arange(6).reshape(3,2)
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> new_a = a[:,:, np.newaxis]
>>> new_a
array([[[0],
        [1]],

       [[2],
        [3]],

       [[4],
        [5]]])
>>> new_a.shape
(3, 2, 1)
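Vector as a one-dimensional array (its shape is just (3,), so NumPy does not mark it as a row or a column):
>>> c = np.array([1,2,3])
>>> c
array([1, 2, 3])
>>> c.shape
(3,)
>>> c.size
3
Row vector (a two-dimensional array with a single row):
>>> r = np.array([ [1,2,3] ])
>>> r
array([[1, 2, 3]])
>>> r.shape
(1, 3)
>>> r.size
3
>>> r[0,0]
1
>>> r[0,1]
2
>>> r[0,2]
3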
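To get an explicit row or column from a one-dimensional array, np.newaxis can be used when indexing. The following sketch uses illustrative names (v, row, col) and also shows that transposing a one-dimensional array leaves its shape unchanged.

```python
import numpy as np

v = np.array([1, 2, 3])    # one-dimensional, shape (3,)

row = v[np.newaxis, :]     # shape (1, 3): a single row
col = v[:, np.newaxis]     # shape (3, 1): a single column

print(v.shape, row.shape, col.shape)

# Transposing a 1-D array does nothing; transposing the row gives the column.
print(v.T.shape)
print(np.array_equal(row.T, col))
```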
To join a sequence of arrays together, we use numpy.concatenate():
numpy.concatenate((a1, a2, ...), axis=0)
Here, axis denotes the axis along which the arrays will be joined. The default is 0, which joins the arrays row-wise (one on top of the other).
row concatenate:
>>> import numpy as np
>>> a = np.array([[1,2], [3,4]])
>>> a.shape
(2, 2)
>>> b = np.array([[5, 6]])
>>> b.shape
(1, 2)
>>> # row join
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
column concatenate:
>>> a = np.array([[1,2], [3,4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.shape
(2, 2)
>>> b = np.array([[5, 6]])
>>> b
array([[5, 6]])
>>> b.shape
(1, 2)
>>> np.concatenate((a,b),axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: all the input array dimensions except for the concatenation axis must match exactly
The arrays must have the same shape along every axis except the one being joined. Here a has 2 rows and b has only 1, so b should be transposed:
>>> bt = b.T
>>> bt
array([[5],
       [6]])
>>> bt.shape
(2, 1)
>>> np.concatenate((a,bt),axis=1)
array([[1, 2, 5],
       [3, 4, 6]])
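The shape rule can be summarized in a short sketch: the sizes add up along the join axis and must agree on every other axis. The names rows_joined and cols_joined are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# axis=0: row counts add up (2 + 1), column counts must match (2 == 2).
rows_joined = np.concatenate((a, b), axis=0)

# axis=1: column counts add up (2 + 1), row counts must match, hence b.T.
cols_joined = np.concatenate((a, b.T), axis=1)

print(rows_joined.shape)
print(cols_joined.shape)

# Without the transpose the row counts differ (2 vs 1), so this raises ValueError.
try:
    np.concatenate((a, b), axis=1)
except ValueError as err:
    print("concatenate failed:", err)
```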
Continued in NumPy Array Basics B.