where()¶
You can use NumPy's where()
function as a vectorized form of "if array element meets condition, then x else y".
For example, given a 2-d array foo, we can create a corresponding array bar which displays "cat" where foo is even and "dog" where foo is odd.
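As a minimal sketch of the idea above, assuming a small integer array (the values of foo here are made up for illustration):

```python
import numpy as np

# Hypothetical 2-d array; the values are assumed for illustration.
foo = np.array([[1, 2, 3],
                [4, 5, 6]])

# "cat" where the element is even, "dog" where it is odd.
bar = np.where(foo % 2 == 0, "cat", "dog")
print(bar)
# [['dog' 'cat' 'dog']
#  ['cat' 'dog' 'cat']]
```

The condition foo % 2 == 0 is evaluated element-wise, so bar has the same shape as foo.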
Math Functions¶
sum()¶
Consider this 2-d array, foo
.
There are numerous ways to take its sum with the sum()
function.
Sum all the values of foo
Sum across axis 0 (column sums)
Sum across axis 1 (row sums)
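The three sums above might look like this, assuming a 3x3 array whose values are chosen only for illustration:

```python
import numpy as np

# Hypothetical 2-d array; the values are assumed for illustration.
foo = np.array([[5.0, 2.0, 9.0],
                [1.0, 0.0, 2.0],
                [1.0, 7.0, 8.0]])

total = np.sum(foo)             # 35.0
col_sums = np.sum(foo, axis=0)  # [ 7.  9. 19.]
row_sums = np.sum(foo, axis=1)  # [16.  3. 16.]

print(total, col_sums, row_sums)
```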
If foo contains NaNs, sum() returns NaN. There are several ways to exclude NaNs or treat them as 0s.
Here we use the where parameter, telling the sum() function to only include elements where ~np.isnan(foo) evaluates to True.
Here we use the nan_to_num()
function, which converts NaN
s to 0 (by default).
Here we use the nansum()
function which treats NaN
s as 0s.
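A short sketch comparing the three approaches, assuming an array with a single NaN (the values are made up for illustration):

```python
import numpy as np

# Hypothetical array containing one NaN; values are assumed for illustration.
foo = np.array([[5.0, np.nan, 9.0],
                [1.0, 0.0, 2.0]])

plain = np.sum(foo)                             # nan, because NaN propagates
with_where = np.sum(foo, where=~np.isnan(foo))  # 17.0, NaN elements are skipped
with_nan_to_num = np.sum(np.nan_to_num(foo))    # 17.0, NaN replaced by 0 first
with_nansum = np.nansum(foo)                    # 17.0, NaN treated as 0

print(plain, with_where, with_nan_to_num, with_nansum)
```

All three NaN-aware variants give the same result here; which one reads best is largely a matter of taste.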
Other Math Functions¶
Unsurprisingly, there are numerous math functions in NumPy, including minimum(), maximum(), mean(), exp(), log(), floor(), and ceil(), among others.
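A quick tour of a few of these functions, using two small arrays whose values are assumed for illustration:

```python
import numpy as np

# Hypothetical inputs; values are assumed for illustration.
x = np.array([1.5, -2.5, 3.0])
y = np.array([2.0, -3.0, 1.0])

elementwise_min = np.minimum(x, y)  # [ 1.5 -3.   1. ]
elementwise_max = np.maximum(x, y)  # [ 2.  -2.5  3. ]
average = np.mean(x)                # (1.5 - 2.5 + 3.0) / 3
floored = np.floor(x)               # [ 1. -3.  3.]
ceiled = np.ceil(x)                 # [ 2. -2.  3.]
round_trip = np.log(np.exp(x))      # approximately x again

print(elementwise_min, elementwise_max, average, floored, ceiled, round_trip)
```

Note that minimum() and maximum() compare two arrays element-wise; they are not the same as taking the smallest or largest value of a single array.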
Truth Value Testing¶
all()¶
You can use the all()
function to check if all the values in an array meet some condition.
Check if all the values are NaN
Check if all the values in each row are NaN
Check if all the values in each column are NaN
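The three checks above can be sketched as follows, assuming an array where only the last row is entirely NaN (values made up for illustration):

```python
import numpy as np

# Hypothetical array; values are assumed for illustration.
foo = np.array([[np.nan, 4.4],
                [1.0, 3.2],
                [np.nan, np.nan]])

is_nan = np.isnan(foo)              # boolean mask, same shape as foo

all_nan = np.all(is_nan)            # False
rows_all_nan = np.all(is_nan, axis=1)  # [False False  True]
cols_all_nan = np.all(is_nan, axis=0)  # [False False]

print(all_nan, rows_all_nan, cols_all_nan)
```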
any()¶
You can use the any()
function to check if any of the values in an array meet some condition.
Check if any value is NaN
Check if any value in each row is NaN
Check if any value in each column is NaN
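A matching sketch for any(), again assuming an array whose values are made up for illustration:

```python
import numpy as np

# Hypothetical array; values are assumed for illustration.
foo = np.array([[np.nan, 4.4],
                [1.0, 3.2],
                [np.nan, np.nan]])

is_nan = np.isnan(foo)

any_nan = np.any(is_nan)               # True
rows_any_nan = np.any(is_nan, axis=1)  # [ True False  True]
cols_any_nan = np.any(is_nan, axis=0)  # [ True  True]

print(any_nan, rows_any_nan, cols_any_nan)
```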
concatenate()¶
You can use the concatenate()
function to combine two or more arrays.
Concatenate roux
with a couple copies of itself row-wise.
Concatenate roux
with a couple copies of itself column-wise.
Concatenate roux
and gumbo
row-wise.
When you concatenate arrays, they must have exactly the same shape, excluding the axis along which you’re concatenating.
For example, if we try to concatenate roux
and gumbo
column-wise, NumPy throws an error.
- ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 3 and the array at index 1 has size 2
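A sketch of these cases, assuming roux has shape (3, 2) and gumbo has shape (2, 2). The shapes are chosen to be consistent with the error message above; the contents are made up:

```python
import numpy as np

# Assumed shapes: roux is (3, 2) and gumbo is (2, 2); contents are illustrative.
roux = np.zeros((3, 2))
gumbo = np.ones((2, 2))

rows_joined = np.concatenate((roux, roux, roux), axis=0)  # shape (9, 2)
cols_joined = np.concatenate((roux, roux, roux), axis=1)  # shape (3, 6)
roux_gumbo = np.concatenate((roux, gumbo), axis=0)        # shape (5, 2)

# Column-wise concatenation fails because the row counts (3 and 2) differ.
try:
    np.concatenate((roux, gumbo), axis=1)
    column_wise_failed = False
except ValueError as err:
    column_wise_failed = True
    print(err)
```

The exact wording of the error message may vary between NumPy versions.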
vstack()¶
vstack()
takes one argument - a sequence of arrays. You could describe its algorithm in pseudocode as
Visually, you could imagine vstack()
as vertically stacking 1-d or 2-d arrays.
Examples
- ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2 and the array at index 1 has size 3
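One way to picture vstack() is as "promote every array to at least 2-d, then concatenate along axis 0". The sketch below expresses that idea with a hypothetical helper, vstack_sketch, which is not part of NumPy, and uses made-up arrays:

```python
import numpy as np

def vstack_sketch(arrays):
    """Rough stand-in for np.vstack (hypothetical helper, not NumPy's API)."""
    # Promote 1-d arrays to a single row, then join along axis 0.
    return np.concatenate([np.atleast_2d(a) for a in arrays], axis=0)

# Hypothetical inputs; values are assumed for illustration.
a = np.array([1, 2])             # shape (2,)
b = np.array([[3, 4], [5, 6]])   # shape (2, 2)
c = np.array([7, 8, 9])          # shape (3,)

stacked = np.vstack((a, b))      # shape (3, 2)
same = np.array_equal(stacked, vstack_sketch((a, b)))
print(stacked, same)

# Stacking rows of length 2 and 3 fails: the column counts do not match.
try:
    np.vstack((a, c))
    vstack_failed = False
except ValueError as err:
    vstack_failed = True
    print(err)
```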
hstack()¶
hstack()
takes one argument - a sequence of arrays. You could describe its algorithm in pseudocode as
Visually, you could imagine hstack()
as horizontally stacking 1-d or 2-d arrays.
Examples
- ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)
- ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 1 and the array at index 1 has size 2
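Similarly, hstack() can be pictured as "concatenate along axis 0 for 1-d arrays, and along axis 1 otherwise". The helper hstack_sketch below is hypothetical (not part of NumPy), and the arrays are made up to reproduce the two kinds of error shown above:

```python
import numpy as np

def hstack_sketch(arrays):
    """Rough stand-in for np.hstack (hypothetical helper, not NumPy's API)."""
    arrays = [np.atleast_1d(a) for a in arrays]
    # 1-d arrays are joined end to end; higher-d arrays are joined along axis 1.
    axis = 0 if arrays[0].ndim == 1 else 1
    return np.concatenate(arrays, axis=axis)

# Hypothetical inputs; values are assumed for illustration.
a = np.array([1, 2])              # shape (2,)
b = np.array([3, 4, 5])           # shape (3,)
m = np.array([[1, 2]])            # shape (1, 2)
n = np.array([[3, 4], [5, 6]])    # shape (2, 2)
p = np.array([[9], [8]])          # shape (2, 1)

flat = np.hstack((a, b))          # [1 2 3 4 5]
wide = np.hstack((n, p))          # [[3 4 9], [5 6 8]]
same = np.array_equal(wide, hstack_sketch((n, p)))

# (a, n): mixing 1-d and 2-d inputs; (m, n): row counts 1 and 2 do not match.
errors = []
for pair in [(a, n), (m, n)]:
    try:
        np.hstack(pair)
    except ValueError as err:
        errors.append(str(err))
print(flat, wide, same, errors)
```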
stack()¶
stack()
takes two arguments:
- a sequence of arrays to combine
- axis, which tells stack() where to create the new axis along which the arrays are combined.
You could describe its algorithm in pseudocode as
Examples
- numpy.AxisError: axis 2 is out of bounds for array of dimension 2
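One way to think about stack() is "give every array a new length-1 axis at the requested position, then concatenate along that axis". The helper stack_sketch below is hypothetical (not part of NumPy), and the arrays are made up for illustration:

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Rough stand-in for np.stack (hypothetical helper, not NumPy's API)."""
    # Insert a new axis of length 1 into each array, then join along it.
    expanded = [np.expand_dims(a, axis) for a in arrays]
    return np.concatenate(expanded, axis=axis)

# Hypothetical inputs; values are assumed for illustration.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

as_rows = np.stack((a, b), axis=0)   # shape (2, 3)
as_cols = np.stack((a, b), axis=1)   # shape (3, 2)
same = (np.array_equal(as_rows, stack_sketch((a, b), axis=0))
        and np.array_equal(as_cols, stack_sketch((a, b), axis=1)))

# Stacking 1-d arrays produces a 2-d result, so axis=2 is out of bounds.
# AxisError is a subclass of ValueError, which is what we catch here.
try:
    np.stack((a, b), axis=2)
    stack_failed = False
except ValueError as err:
    stack_failed = True
    print(err)
```

Unlike concatenate(), stack() requires every input array to have the same shape.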
Sorting¶
You can use NumPy’s sort() function to sort the elements of an array. sort() takes three primary parameters:
- a: the array you want to sort
- axis: the axis along which to sort. (The default, -1, sorts along the last axis.)
- kind: the kind of sort you want NumPy to implement. By default, NumPy implements quicksort.
For example, here we make a 1-d array, foo, and then sort it in ascending order.
Note that the original array remains unchanged.
If you want to sort the values of foo
in place, use the sort
method of the array object.
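A short sketch of the difference between the function and the method, assuming a 1-d array whose values are made up for illustration:

```python
import numpy as np

# Hypothetical 1-d array; values are assumed for illustration.
foo = np.array([1, 7, 3, 9, 0, 9, 1])

sorted_copy = np.sort(foo)   # returns a new, sorted array
print(sorted_copy)           # [0 1 1 3 7 9 9]
print(foo)                   # [1 7 3 9 0 9 1]  (unchanged)
unchanged = foo.copy()       # snapshot taken before the in-place sort

foo.sort()                   # sorts in place and returns None
print(foo)                   # [0 1 1 3 7 9 9]
```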
Sort with NaN¶
If you have an array with NaN values, sort()
pushes them to the end of the array.
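For instance, assuming a small float array with one NaN (values made up for illustration):

```python
import numpy as np

# Hypothetical array with one NaN; values are assumed for illustration.
bar = np.array([5.0, np.nan, 3.0, 11.0])

sorted_bar = np.sort(bar)
print(sorted_bar)   # [ 3.  5. 11. nan]
```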
Sort In Descending Order (Reverse Sort)¶
Unfortunately, NumPy doesn't have a direct way of sorting arrays in descending order. However, there are multiple ways to accomplish this.
- Sort the array in ascending order and then reverse the result.
- Negate the array’s values, sort those in ascending order, and then negate that result.
The main difference between these techniques is that the first method pushes NaN
s to the front of the array
and the second method pushes NaN
s to the back. Also, the second method won’t work on strings since you can’t negate a
string.
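Both techniques, and their different treatment of NaN, can be sketched like this (the array values are assumed for illustration):

```python
import numpy as np

# Hypothetical array with one NaN; values are assumed for illustration.
bar = np.array([5.0, np.nan, 3.0, 11.0])

# Method 1: sort ascending, then reverse. NaN ends up at the front.
reversed_sort = np.sort(bar)[::-1]
print(reversed_sort)   # [nan 11.  5.  3.]

# Method 2: negate, sort ascending, negate again. NaN stays at the back.
negated_sort = -np.sort(-bar)
print(negated_sort)    # [11.  5.  3. nan]
```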
Sorting A Multidimensional Array¶
What if you wanted to sort a multidimensional array like this?
In this case, you can use the axis
parameter of the sort()
function to specify which axis to sort along.
Sort each column of a 2-d array¶
Sort each row of a 2-d array¶
Sort the last axis of an array¶
Since boo is a 2-d array, the last axis, 1, is the column axis. Thus np.sort(boo, axis=-1) is equivalent to np.sort(boo, axis=1).
Tip
When we talk about sorting along an axis, each element's position in the array remains fixed except for that axis. For example, observe the 20 in boo
. When we sort along the row axis (axis 0), only its row coordinate changes (from (1,0)
to (0,0)
). When we sort along the column axis (axis 1), only its column coordinate changes (from (1,0)
to (1,1)
). That's why sorting along axis 0 does column sorts in a 2-d array and sorting along axis 1 does row sorts in a 2-d array.
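The axis sorts and the Tip above can be traced with a concrete array. The values of boo below are assumed; they were chosen so that 20 sits at position (1,0), as the Tip describes:

```python
import numpy as np

# Assumed values for boo, chosen so that 20 starts at position (1, 0).
boo = np.array([[55, 10, 12],
                [20, 0, 33],
                [55, 92, 3]])

by_col = np.sort(boo, axis=0)    # sort each column
print(by_col)
# [[20  0  3]
#  [55 10 12]
#  [55 92 33]]   -> 20 moved from (1, 0) to (0, 0)

by_row = np.sort(boo, axis=1)    # sort each row
print(by_row)
# [[10 12 55]
#  [ 0 20 33]
#  [ 3 55 92]]   -> 20 moved from (1, 0) to (1, 1)

by_last = np.sort(boo, axis=-1)  # last axis of a 2-d array is axis 1
print(np.array_equal(by_last, by_row))  # True
```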
argsort()¶
argsort()
works just like sort(), except it returns the array of indices that would sort the array: element i of the result is the position, in the original array, of the i-th smallest value.
Example
Here, argsort()
tells us:
- the smallest element of foo is at position 1
- the second smallest element of foo is at position 0
- the third smallest element of foo is at position 3
- the fourth smallest element of foo is at position 2
If you used this array to index the original array, you’d get its sorted form (just as if you had called np.sort(foo)).
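As a sketch, here is an array whose argsort() matches the positions listed above. The values of foo are assumed; any array with the same relative ordering would do:

```python
import numpy as np

# Assumed values for foo, chosen so that argsort() gives [1, 0, 3, 2].
foo = np.array([3, 0, 10, 5])

order = np.argsort(foo)
print(order)        # [1 0 3 2]

# Indexing foo with the result reproduces np.sort(foo).
print(foo[order])   # [ 0  3  5 10]
```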
Sort the rows of a 2-d array according to its first column¶
If you want to reorder the rows of boo
according to the values in its first column, you can plug in the index
array [1, 0, 2]
.
To create the index array dynamically, simply call argsort()
on the first column of boo
.
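Both steps might look like this. The values of boo are assumed and were chosen so that [1, 0, 2] is a valid ordering of its first column:

```python
import numpy as np

# Assumed values for boo; its first column is [55, 20, 55].
boo = np.array([[55, 10, 12],
                [20, 0, 33],
                [55, 92, 3]])

# Reorder the rows with a hand-written index array.
manual = boo[[1, 0, 2]]
print(manual)
# [[20  0 33]
#  [55 10 12]
#  [55 92  3]]

# Build the index array from the first column instead.
order = np.argsort(boo[:, 0])
by_first_col = boo[order]
print(by_first_col[:, 0])   # [20 55 55]
```

Because 55 appears twice in the first column, the relative order of those two rows is not guaranteed unless a stable sort is requested.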
Stable Sorting¶
The previous example raises an important question. If an array has repeated values, how do we guarantee that sorting them won't alter the order they appear in the original array? For example, given boo, two different arrays are both valid sorts of boo along its first column, but only one of them retains the original order of the rows beginning with 55. A sort that preserves the original order of equal elements is known as a [stable sorting algorithm](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability). By default, np.sort() and np.argsort() don't use a stable sorting algorithm. If you'd like to use a stable sort, set the kind parameter equal to 'stable'.
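A minimal sketch of a stable row sort, assuming boo has two rows that begin with 55 (the values are made up for illustration):

```python
import numpy as np

# Assumed values for boo; rows 0 and 2 both begin with 55.
boo = np.array([[55, 10, 12],
                [20, 0, 33],
                [55, 92, 3]])

# kind='stable' guarantees that equal keys keep their original relative order.
order = np.argsort(boo[:, 0], kind='stable')
print(order)        # [1 0 2]

stable_sorted = boo[order]
print(stable_sorted)
# [[20  0 33]
#  [55 10 12]
#  [55 92  3]]
```

Row [55, 10, 12] stays ahead of row [55, 92, 3] because it came first in the original array.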
unique()¶
You can use the unique()
function to get the unique elements of an array.
Example
You may have noticed that 'b' appeared first in the input but 'a' appeared first in the output. That's because unique()
returns the unique elements in sorted order.
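For example, assuming an array of single-letter strings in which 'b' appears first (the values are made up for illustration):

```python
import numpy as np

# Hypothetical input; 'b' appears before 'a'. Values are assumed.
gar = np.array(['b', 'b', 'a', 'a', 'c', 'c', 'c'])

uniques = np.unique(gar)
print(uniques)   # ['a' 'b' 'c']  (sorted, not in order of appearance)
```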
Get unique elements in order of first occurrence¶
You can use return_index=True to get the index of the first occurrence of each element in an array.
With return_index=True, NumPy returns a tuple containing
- the unique elements array
- a corresponding array with the index at which each element first occurred in the original array
In the above example, 'a' first occurred at index 2 in the original array, 'b' first occurred at index 0, and so on.
If you want to reorder the unique elements in the same order they occurred in the original array, call argsort() on the index array and use the result to reorder the unique elements array.
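The whole recipe can be sketched as follows. The input values are assumed, chosen so that 'a' first occurs at index 2 and 'b' at index 0:

```python
import numpy as np

# Hypothetical input; 'b' first occurs at index 0 and 'a' at index 2.
gar = np.array(['b', 'b', 'a', 'a', 'c', 'c', 'c'])

uniques, first_idx = np.unique(gar, return_index=True)
print(uniques)     # ['a' 'b' 'c']
print(first_idx)   # [2 0 4]

# argsort() of the first-occurrence indices gives the order of appearance.
in_order = uniques[np.argsort(first_idx)]
print(in_order)    # ['b' 'a' 'c']
```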
unique() with counts¶
You can use return_counts=True
to additionally return the count of each element.
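A short sketch, again with input values assumed for illustration:

```python
import numpy as np

# Hypothetical input; values are assumed for illustration.
gar = np.array(['b', 'b', 'a', 'a', 'c', 'c', 'c'])

uniques, counts = np.unique(gar, return_counts=True)
print(uniques)   # ['a' 'b' 'c']
print(counts)    # [2 2 3]

# Pair each unique element with its count.
tally = {str(u): int(c) for u, c in zip(uniques, counts)}
print(tally)     # {'a': 2, 'b': 2, 'c': 3}
```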