NumPy array basics B
This chapter continues from NumPy Array Basics A. We've been working with the following NumPy array:
>>> import numpy as np
>>> rArray = np.arange(0,20).reshape((5,4))
>>> rArray
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
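In the previous chapter, we learned that slicing a NumPy array works much like slicing a Python list.
As with Python lists, we can assign values to specific positions. (The 50 and 100 in the first output below appear to be carried over from edits made in the previous chapter.)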
>>> squareArray = rArray[:3, :3]
>>> squareArray
array([[  0,   1,   2],
       [  4,  50,   6],
       [  8,   9, 100]])
>>> squareArray[0,0] *= 10
>>> squareArray[1,1] *= 10
>>> squareArray[2,2] *= 10
>>> squareArray
array([[   0,    1,    2],
       [   4,  500,    6],
       [   8,    9, 1000]])
We can also assign to a slice. The example below selects the first row and sets every element in it to -1:
>>> squareArray
array([[   0,    1,    2],
       [   4,  500,    6],
       [   8,    9, 1000]])
>>> squareArray[:1, :] = -1
>>> squareArray
array([[  -1,   -1,   -1],
       [   4,  500,    6],
       [   8,    9, 1000]])
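As a self-contained sketch of slice assignment, the snippet below uses a hypothetical array named grid (not one of the arrays above). It shows a scalar being broadcast over a slice, a sequence filling a column, and the fact that a basic slice is a view, so writing through it changes the original array.

```python
import numpy as np

# Hypothetical 3x3 array, used only for this illustration.
grid = np.arange(9).reshape(3, 3)

# A scalar on the right-hand side is broadcast to every selected element.
grid[:1, :] = -1          # first row becomes -1

# A sequence of matching length fills the selected column.
grid[:, 2] = [7, 8, 9]    # last column becomes 7, 8, 9

# Basic slicing returns a view, so writing through it changes grid too.
corner = grid[1:, :2]
corner[:] = 0

print(grid)
# [[-1 -1  7]
#  [ 0  0  8]
#  [ 0  0  9]]
```

If an independent array is needed, calling .copy() on the slice avoids modifying the original.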
We can index a NumPy array using an array of indices:
>>> import numpy as np
>>> indxArr = np.array([0,1,1,2,3])
>>> indxArr
array([0, 1, 1, 2, 3])
>>> rnd = np.random.random((10,1))
>>> rnd
array([[ 0.20903716],
       [ 0.98787586],
       [ 0.12038364],
       [ 0.54208977],
       [ 0.49319279],
       [ 0.77011847],
       [ 0.57856482],
       [ 0.55202036],
       [ 0.58084383],
       [ 0.45641956]])
>>> rnd[indxArr]
array([[ 0.20903716],
       [ 0.98787586],
       [ 0.98787586],
       [ 0.12038364],
       [ 0.54208977]])
We first define an index array, indxArr, and then use it to access elements of the random array, rnd. As the output shows, we get the 0th, 1st, 1st, 2nd, and 3rd elements of rnd; an index may be repeated.
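The following minimal sketch repeats the idea with fixed, hypothetical data so the result is reproducible. It also illustrates that indexing with an integer array produces a copy, unlike basic slicing.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])   # hypothetical data
idx = np.array([0, 1, 1, 2, 3])           # index 1 is used twice

picked = values[idx]     # elements 0, 1, 1, 2, 3 of values
picked[0] = 999          # changes only the copy

print(picked)            # [999  20  20  30  40]
print(values)            # [10 20 30 40 50]
```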
np.empty(...) returns an array without initializing it, so it holds arbitrary junk values:
>>> import numpy as np
>>> emptyArray = np.empty((2,3))
>>> emptyArray
array([[  0.00000000e+000,   3.39519327e-313,   0.00000000e+000],
       [  4.94065646e-324,   1.83322544e-316,   6.94110822e-310]])
The values may look random, but they are not: they are simply whatever happened to be in that memory. So, if we need actual random numbers, we should not use empty(...).
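A short sketch of the alternatives, assuming a NumPy version that provides np.random.default_rng (1.17 or later). The seed value and variable names are arbitrary choices for illustration.

```python
import numpy as np

# np.empty only allocates memory; treat its contents as undefined
# until every element has been written.
buffer = np.empty((2, 3))
buffer[:] = 0.0                      # fill it before reading from it

zeros = np.zeros((2, 3))             # guaranteed to be all zeros

# For actual random numbers, use a random generator.
rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily
samples = rng.random((2, 3))         # uniform floats in [0, 1)
```

np.empty is mainly useful as a preallocated buffer when every element will be overwritten anyway.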
We can also index a NumPy array with a boolean array:
>>> squareArray
array([[  -1,   -1,   -1],
       [   4,  500,    6],
       [   8,    9, 1000]])
>>> boolArray = np.eye(3, dtype=bool)   # True on the diagonal only
>>> boolArray
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]], dtype=bool)
>>> squareArray[boolArray]
array([  -1,  500, 1000])
The boolean index array is True only on the diagonal, so we get just the diagonal elements, returned as a 1-D array.
>>> squareArray
array([[  -1,   -1,   -1],
       [   4,  500,    6],
       [   8,    9, 1000]])
>>> indxRow = np.array([False, True, False])
>>> indxCol = np.array([True, False, True])
>>> squareArray[indxRow, indxCol]
array([4, 6])
indxRow selects only the 2nd row, and indxCol selects only the 1st and 3rd columns, so we get the elements 4 and 6.
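One caveat worth sketching: when two 1-D boolean arrays are passed together, NumPy converts each to the positions of its True values and pairs them up, so the result is not a row-by-column block in general. The example above works because only one row is selected. The names below (square, rows, cols) are illustrative; np.ix_ is one way to request the full cross product.

```python
import numpy as np

square = np.array([[-1, -1, -1],
                   [4, 500, 6],
                   [8, 9, 1000]])

rows = np.array([False, True, True])   # rows 1 and 2
cols = np.array([True, False, True])   # columns 0 and 2

# Masks become index lists [1, 2] and [0, 2], paired element by element:
# this picks (1, 0) and (2, 2) only.
paired = square[rows, cols]
print(paired)            # [   4 1000]

# np.ix_ builds an open mesh, selecting every row/column combination.
block = square[np.ix_(rows, cols)]
print(block)
# [[   4    6]
#  [   8 1000]]
```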
We can build a boolean array from a test and use it as an index to get the elements that pass the test:
>>> import numpy as np
>>> myArray = np.arange(0,9).reshape(3,3)
>>> myArray
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> myAverage = np.average(myArray)
>>> myAverage
4.0
>>> aboveAverage = myArray > myAverage
>>> aboveAverage
array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]], dtype=bool)
>>> myArray[aboveAverage]
array([5, 6, 7, 8])
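Tests can also be combined. The sketch below, using an illustrative array named data, joins two comparisons with & and inverts a mask with ~. Note that the Python keywords and, or, and not do not work element-wise on arrays.

```python
import numpy as np

data = np.arange(9).reshape(3, 3)

# Parentheses are required because & binds tighter than the comparisons.
middle = (data > 2) & (data < 7)
print(data[middle])      # [3 4 5 6]

# ~ inverts a boolean mask: everything outside the band.
print(data[~middle])     # [0 1 2 7 8]
```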
The boolean indexing shown above can also be used to assign values to elements of an array. This is particularly useful when we want to filter an array, for example to make sure that all of its values stay above or below a certain threshold:
We'll use std(...) which returns the standard deviation of all the elements in the given array.
>>> import numpy as np
>>> myArray = np.arange(0,9).reshape(3,3)
>>> myArray
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
...
>>> myArray[aboveAverage]
array([5, 6, 7, 8])
>>> standardDev = np.std(myArray)
>>> standardDev
2.5819888974716112
We'll make a copy of myArray and clamp it so that it only contains values within one standard deviation of the mean. Values that are too low are set to the lower bound (myAverage - standardDev), and values that are too high are set to the upper bound (myAverage + standardDev). We set dtype=float because myAverage and standardDev are generally floating-point numbers.
>>> clampedMyArray = np.array(myArray.copy(), dtype=float)
>>> clampedMyArray
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.]])
>>> clampedMyArray[ (myArray-myAverage) > standardDev ] = myAverage+standardDev
>>> clampedMyArray[ (myArray-myAverage) < -standardDev ] = myAverage-standardDev
>>> clampedMyArray
array([[ 1.4180111,  1.4180111,  2.       ],
       [ 3.       ,  4.       ,  5.       ],
       [ 6.       ,  6.5819889,  6.5819889]])
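For comparison, NumPy's np.clip performs the same kind of clamping in a single call. The sketch below rebuilds the clamped array both ways and checks that they agree; the names manual and clipped are illustrative.

```python
import numpy as np

myArray = np.arange(0, 9).reshape(3, 3)
myAverage = np.average(myArray)
standardDev = np.std(myArray)

# Clamping with boolean-mask assignment, as in the text.
manual = myArray.astype(float)
manual[(myArray - myAverage) > standardDev] = myAverage + standardDev
manual[(myArray - myAverage) < -standardDev] = myAverage - standardDev

# The same bounds applied with np.clip(array, lower, upper).
clipped = np.clip(myArray.astype(float),
                  myAverage - standardDev,
                  myAverage + standardDev)

print(np.allclose(manual, clipped))   # True
```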
Arithmetic operators work element by element on NumPy arrays:
>>> import numpy as np
>>> myArray = np.arange(1,10).reshape(3,3)
>>> myArray
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> myArray*10
array([[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]])
>>> myZeros = myArray * np.zeros((3,3))
>>> myZeros
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
>>> addedOnes = myArray + np.ones((3,3))
>>> addedOnes
array([[  2.,   3.,   4.],
       [  5.,   6.,   7.],
       [  8.,   9.,  10.]])
np.dot(...), in contrast, performs matrix multiplication when given 2-D arrays:
>>> A = np.array( [ [1,2],[3,4] ] )
>>> B = np.array( [ [5,6],[7,8] ] )
>>> np.dot(A,B)
array([[19, 22],
       [43, 50]])
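To make the difference concrete, this sketch compares the element-wise product A * B with the matrix product on the same two arrays. The @ operator (Python 3.5 and later) gives the same result as np.dot for 2-D arrays.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B        # multiplies matching positions
matrix = np.dot(A, B)      # rows of A times columns of B
operator = A @ B           # same as np.dot for 2-D arrays

print(elementwise)
# [[ 5 12]
#  [21 32]]
print(matrix)
# [[19 22]
#  [43 50]]
```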
np.append(...) returns a new array with the given values added at the end:
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = a + 1
>>> b
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> np.append(b, [11,12])
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
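Two details of np.append are easy to miss, sketched here with illustrative names: it never modifies its input, and without an axis argument it flattens multi-dimensional inputs before appending.

```python
import numpy as np

b = np.arange(1, 11)
longer = np.append(b, [11, 12])
print(b.size, longer.size)       # 10 12  (b itself is unchanged)

m = np.array([[1, 2], [3, 4]])

# Without axis, both inputs are flattened first.
flat = np.append(m, [[5, 6]])
print(flat)                      # [1 2 3 4 5 6]

# With axis=0, the new row is stacked below the existing rows.
stacked = np.append(m, [[5, 6]], axis=0)
print(stacked)
# [[1 2]
#  [3 4]
#  [5 6]]
```

Because each call allocates a new array, appending repeatedly in a loop can be slow; collecting values in a Python list and converting once is a common alternative.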
numpy.where(condition[, x, y])
It returns elements chosen from either x or y, depending on condition.
If only condition is given, return condition.nonzero().
Parameters :
- condition : array_like, bool When True, yield x, otherwise yield y.
- x, y : array_like, optional Values from which to choose. x, y and condition need to be broadcastable to a common shape.
Returns :
- out : ndarray or tuple of ndarrays If both x and y are specified, the output array contains elements of x where condition is True, and elements from y elsewhere. If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.
When only the condition is given, where returns the indices of the elements that satisfy the condition:
>>> x = np.arange(9).reshape(3,3)
>>> x
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.where(x>7)
(array([2]), array([2]))
It returns (2,2) as indices for the element '8'.
>>> np.where(x <= 3)
(array([0, 0, 0, 1]), array([0, 1, 2, 0]))
If several elements meet the condition, it returns the indices of all of them, as shown in the example above: the first array holds the row indices and the second holds the column indices of the four elements that are <= 3.
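Since the result is one array per dimension, it can help to pair the row and column indices into coordinates. This is a minimal sketch; the variable names are illustrative, and np.argwhere is shown as an alternative that returns the coordinates directly.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)

rows, cols = np.where(x <= 3)

# Pair each row index with its column index.
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords)                    # [(0, 0), (0, 1), (0, 2), (1, 0)]

# np.argwhere returns the same coordinates as an (N, 2) array.
print(np.argwhere(x <= 3).tolist())
# [[0, 0], [0, 1], [0, 2], [1, 0]]
```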
The following example shows the case where both x and y are given along with the condition. Where the condition is True, where() returns the element from x; otherwise it returns the element from y. Here, the diagonal positions get x's elements and the off-diagonal positions get y's elements:
>>> np.where([ [True, False, False], [False, True, False], [False, False, True] ],
...          [ [11, 12, 13], [21, 22, 23], [31, 32, 33] ],
...          [ [911, 912, 913], [921, 922, 923], [931, 932, 933] ])
array([[ 11, 912, 913],
       [921,  22, 923],
       [931, 932,  33]])
In the same context, the example below returns diagonal indices (0,0), (1,1), and (2,2):
>>> np.where( [ [1,0,0],[0,1,0],[0,0,1] ])
(array([0, 1, 2]), array([0, 1, 2]))
Let's look at more cases:
>>> x = np.arange(9.).reshape(3, 3)
>>> x
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.]])
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
>>> x[np.where( x > 5 )]
array([ 6.,  7.,  8.])
>>> x[np.where(x<5)]
array([ 0.,  1.,  2.,  3.,  4.])
The following example works as a sort of mask: every element of x that does not satisfy x < 5 is replaced by -1:
>>> np.where(x<5, x, -1)
array([[ 0.,  1.,  2.],
       [ 3.,  4., -1.],
       [-1., -1., -1.]])
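The three-argument form builds a new array and leaves x untouched. The sketch below checks that it matches copying x and then assigning through a boolean mask; the names masked and manual are illustrative.

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)

# The scalar -1 is broadcast against x and the condition.
masked = np.where(x < 5, x, -1)

# Equivalent result via a copy plus boolean-mask assignment.
manual = x.copy()
manual[~(x < 5)] = -1

print(np.array_equal(masked, manual))   # True
print(x[2, 2])                          # 8.0  (x is unchanged)
```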
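With plain Python lists, we can pick elements from x or y according to a condition using a list comprehension:
>>> x = [1,2,3,4,5]
>>> y = [11,12,13,14,15]
>>> condition = [True,False,True,False,True]
>>> [xv if c else yv for (c,xv,yv) in zip(condition,x,y)]
[1, 12, 3, 14, 5]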
The same thing can be done using NumPy's where:
>>> import numpy as np
>>> np.where([1,0,1,0,1], np.arange(1,6), np.arange(11,16))
array([ 1, 12,  3, 14,  5])
The astype method casts an array to a specified type:
>>> x = np.array([1.1, 2.2, 3.3])
>>> x
array([ 1.1,  2.2,  3.3])
>>> x.astype(int)
array([1, 2, 3])
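Casting floats to int with astype drops the fractional part (truncation toward zero) rather than rounding. A small sketch with illustrative values shows the difference; rounding first with np.round is one option when nearest-integer behavior is wanted.

```python
import numpy as np

vals = np.array([1.9, -1.9, 2.5])

truncated = vals.astype(int)            # fractional part is discarded
print(truncated)                        # [ 1 -1  2]

# np.round rounds halves to the nearest even value, so 2.5 becomes 2.
rounded = np.round(vals).astype(int)
print(rounded)                          # [ 2 -2  2]
```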
A slightly more complex example:
mask2 = np.where((mask==2)|(mask==0),0,1).astype('uint8')
Where mask is 2 or 0, mask2 gets 0; otherwise it gets 1. The result is then cast to the 'uint8' type.
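To see the expression in action, here is a sketch with a small hypothetical mask holding the labels 0 through 3. The text does not show the real contents of mask, so these values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical label mask; the actual mask is not shown in the text.
mask = np.array([[0, 1],
                 [2, 3]])

# Labels 0 and 2 map to 0, every other label maps to 1.
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')

print(mask2)
# [[0 1]
#  [0 1]]
print(mask2.dtype)       # uint8
```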