What is NumPy ndarray?


The N-dimensional array object or ndarray is an important feature of NumPy. This is a fast and flexible container for huge data sets in Python.

Arrays allow us to perform mathematical operations on entire blocks of data using similar syntax to the corresponding operations between scalar elements:

In [8]: data


array ([[ 0.9526, -0.246 , -0.8856],

[ 0.5639, 0.2379, 0.9104]])


In [9]: data * 10 In [10]: data + data

Out[9]: Out[10]:

array ([[ 9.5256, -2.4601, -8.8565], array([[ 1.9051, -0.492 , -1.7713],

[5.6385, 2.3794, 9.104]]) [1.1277, 0.4759, 1.8208]])

For homogeneous data, a ndarray is a generic multidimensional container. All of the elements must be the same type for that container. All arrays have a shape. For example, a tuple representing the size of each dimension, and a dtype, an object describing the data type of the array:

In [11]: data.shape

Out [11]: (2, 3)

In [12]: data.dtype

Out [12]: dtype ('float64')

Becoming proficient in array-oriented programming and thinking is a key step along the way to becoming a scientific Python guru, although it’s not necessary to have a deep understanding of NumPy for many data analytical applications. Every time we see “array”, “NumPy array”, or “ndarray” in the text, with few exceptions they all refer to the same thing: the ndarray object.

Creating ndarrays

The stress-free way to create an array is to use the array function. This including other arrays accepts any sequence-like object () and produces a new NumPy array containing the past data. For instance, a list is a good candidate for conversion:

In [13]: data1 = [6, 7.5, 8, 0, 1]

In [14]: arr1 = np.array(data1)

In [15]: arr1

Out[15]: array([ 6. , 7.5, 8. , 0. , 1. ])

Nested sequences, like a list of equal-length lists, will be converted into a multidimensional array:


In [16]: data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]

In [17]: arr2 = np.array (data2)

In [18]: arr2


array([[1, 2, 3, 4],

[5, 6, 7, 8]])

In [19]: arr2.ndim

Out[19]: 2

In [20]: arr2.shape

Out[20]: (2, 4)
Except clearly stated np.array tries to infer a good data type for the array that it creates. The data type is stored in a special dtype object; for instance, in the above two examples we have:
In [21]: arr1.dtype

Out[21]: dtype('float64')

In [22]: arr2.dtype

Out[22]: dtype('int64')

Furthermore to np.array, there are a number of other functions for creating new arrays.

The best examples are zeros and ones that create arrays of 0’s or 1’s, respectively, with a given length or shape empty creates an array without initializing its values to any particular value. Pass a tuple for the shape to create a higher dimensional array with these methods:


In [23]: np.zeros (10)

Out[23]: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [24]: np.zeros((3, 6))


array ([[ 0., 0., 0., 0., 0., 0.],

[ 0., 0., 0., 0., 0., 0.],

[ 0., 0., 0., 0., 0., 0.]])

In [25]: np.empty((2, 3, 2))


array ([[[ 4.94065646e-324, 4.94065646e-324],

[ 3.87491056e-297, 2.46845796e-130],

[ 4.94065646e-324, 4.94065646e-324]],

[[ 1.90723115e+083, 5.73293533e-053],

[ -2.33568637e+124, -6.70608105e-012],

[ 4.42786966e+160, 1.27100354e+025]]])

This is not mild to assume that np. empty will return an array of all zeros. In many cases, as previously shown, it will return uninitialized garbage values arranged in an array-valued version of the built-in Python range function:

In [26]: np.arange(15)

Out [26]: array ([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Python Data Types for ndarrays

The data type or dtype is a special object containing the information the ndarray needs to understand a mass of memory as a particular type of data:

In [27]: arr1 = np.array ([1, 2, 3], dtype=np.float64)

In [28]: arr2 = np.array ([1, 2, 3], dtype=np.int32)

In [29]: arr1.dtype In [30]: arr2.dtype

Out [29]: dtype (‘float64’) Out[30]: dtype (‘int32’)

Dtypes are part of what make NumPy so powerful and flexible. They map directly onto an underlying machine representation in most cases that making it easy to read and write binary streams of data to disk and also to connect to code written in a low-level

language like C or FORTRAN. The numerical dtypes are named the same way: a type name, like float or integer, followed by a number indicating the number of bits per element. A standard double-precision floating-point value takes up 8 bytes or 64 bits. Therefore, this type is known in NumPy as float64.

Don’t be concerned about memorizing the NumPy dtypes, especially if we’re a new user. It’s often only necessary to care about the general kind of data we’re dealing with, whether floating-point, complex, integer, Boolean, string, or general Python object. When we need more control over how data are stored in memory and on disk, especially large data sets, it is good to know that we have control over the storage type.


Mansoor Ahmed

Mansoor Ahmed is Chemical Engineer, web developer, a Tech writer currently living in Pakistan. My interests range from technology to web development. I am also interested in programming, writing, and reading.