# Questions tagged [numpy]

NumPy is an extension to the Python language that adds support for large, multidimensional arrays and matrices, along with a large library of high-level mathematical functions that operate on these arrays.

11,685
questions

**404**

votes

**11**answers

597k views

### Pandas conditional creation of a series/dataframe column

I have a dataframe along the lines of the below:
Type Set
1 A Z
2 B Z
3 B X
4 C Y
I want to add another column to the dataframe (...

**288**

votes

**26**answers

255k views

### Split (explode) pandas dataframe string entry to separate rows

I have a pandas dataframe in which one column of text strings contains comma-separated values. I want to split each CSV field and create a new row per entry (assume that CSV are clean and need only be ...

**295**

votes

**7**answers

1.0m views

### ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I just discovered a logical bug in my code which was causing all sorts of problems. I was inadvertently doing a bitwise AND instead of a logical AND.
I changed the code from:
r = mlab.csv2rec(...

**484**

votes

**17**answers

584k views

### How to find all occurrences of an element in a list

index() will give the first occurrence of an item in a list. Is there a neat trick which returns all indices in a list for an element?
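
One common approach, sketched here under the assumption that a plain Python list is involved, is to pair each item with its position using `enumerate` and keep the positions that match. The helper name `all_indices` is hypothetical and not part of any library.

```python
def all_indices(items, target):
    """Return every index at which target occurs in items."""
    return [i for i, item in enumerate(items) if item == target]


values = ["a", "b", "a", "c", "a"]
print(all_indices(values, "a"))  # [0, 2, 4]
```

An element that never occurs yields an empty list rather than raising `ValueError`, which is how this differs from `list.index()`.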

**171**

votes

**15**answers

81k views

### Cartesian product of x and y array points into single array of 2D points

I have two numpy arrays that define the x and y axes of a grid. For example:
x = numpy.array([1,2,3])
y = numpy.array([4,5])
I'd like to generate the Cartesian product of these arrays to generate:
...

**224**

votes

**20**answers

131k views

### Find unique rows in numpy.array

I need to find unique rows in a numpy.array.
For example:
>>> a # I have
array([[1, 1, 1, 0, 0, 0],
[0, 1, 1, 1, 0, 0],
[0, 1, 1, 1, 0, 0],
[1, 1, 1, 0, 0, 0],
[...

**138**

votes

**9**answers

82k views

### How do you fix "runtimeError: package fails to pass a sanity check" for numpy and pandas?

This is the error I am getting and, as far as I can tell, there is nothing useful on the error link to fix this.
RuntimeError: The current Numpy installation
('...\\venv\\lib\\site-packages\\numpy\\...

**514**

votes

**5**answers

209k views

### What are the advantages of NumPy over regular Python lists?

What are the advantages of NumPy over regular Python lists?
I have approximately 100 financial markets series, and I am going to create a cube array of 100x100x100 = 1 million cells. I will be ...

**14**

votes

**2**answers

3k views

### Python: Justifying NumPy array

Please I am a bit new to Python and it has been nice, I could comment that python is very sexy till I needed to shift content of a 4x4 matrix which I want to use in building a 2048 game demo of the ...

**504**

votes

**11**answers

692k views

### Most efficient way to map function over numpy array

What is the most efficient way to map a function over a numpy array? The way I've been doing it in my current project is as follows:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
# Obtain array ...

**601**

votes

**15**answers

1.3m views

### Convert pandas dataframe to NumPy array

I am interested in knowing how to convert a pandas dataframe into a NumPy array.
dataframe:
import numpy as np
import pandas as pd
index = [1, 2, 3, 4, 5, 6, 7]
a = [np.nan, np.nan, np.nan, 0.1, 0....

**175**

votes

**10**answers

179k views

### Using numpy to build an array of all combinations of two arrays

I'm trying to run over the parameters space of a 6 parameter function to study its numerical behavior before trying to do anything complex with it, so I'm searching for an efficient way to do this.
My ...

**137**

votes

**2**answers

161k views

### Binning a column with Python Pandas

I have a data frame column with numeric values:
df['percentage'].head()
46.5
44.2
100.0
42.12
I want to see the column as bin counts:
bins = [0, 1, 5, 10, 25, 50, 100]
How can I get the result as ...

**663**

votes

**23**answers

1.1m views

### How can the Euclidean distance be calculated with NumPy?

I have two points in 3D:
(xa, ya, za)
(xb, yb, zb)
And I want to calculate the distance:
dist = sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)
What's the best way to do this with NumPy, or with Python in ...
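
A minimal sketch of the formula above, assuming the two points are held in NumPy arrays (the coordinate values are made up for illustration). Note that `^` in the formula means exponentiation, whereas in Python `^` is bitwise XOR.

```python
import numpy as np

# Illustrative coordinates; any two points of equal dimension work.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# The L2 norm of the difference vector is the Euclidean distance.
dist = np.linalg.norm(a - b)
print(dist)  # 5.0

# The same value written out term by term, mirroring the formula.
dist_explicit = np.sqrt(np.sum((a - b) ** 2))
```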

**28**

votes

**6**answers

12k views

### Find the row indexes of several values in a numpy array

I have an array X:
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
And I wish to find the index of the row of several values in ...

**367**

votes

**13**answers

566k views

### Converting between datetime, Timestamp and datetime64

How do I convert a numpy.datetime64 object to a datetime.datetime (or Timestamp)?
In the following code, I create a datetime, timestamp and datetime64 objects.
import datetime
import numpy as np
...

**36**

votes

**3**answers

13k views

### Taking subarrays from numpy array with given stride/stepsize

Let's say I have a Python NumPy array a.
a = numpy.array([1,2,3,4,5,6,7,8,9,10,11])
I want to create a matrix of sub sequences from this array of length 5 with stride 3. The results matrix hence will ...

**374**

votes

**7**answers

172k views

### Difference between numpy.array shape (R, 1) and (R,)

In numpy, some of the operations return in shape (R, 1) but some return (R,). This will make matrix multiplication more tedious since explicit reshape is required. For example, given a matrix M, if we ...

**637**

votes

**20**answers

524k views

### How do I get indices of N maximum values in a NumPy array?

NumPy proposes a way to get the index of the maximum value of an array via np.argmax.
I would like a similar thing, but returning the indexes of the N maximum values.
For instance, if I have an ...

**121**

votes

**5**answers

83k views

### Use numpy array in shared memory for multiprocessing

I would like to use a numpy array in shared memory for use with the multiprocessing module. The difficulty is using it like a numpy array, and not just as a ctypes array.
from multiprocessing import ...

**61**

votes

**3**answers

18k views

### Performant cartesian product (CROSS JOIN) with pandas

The contents of this post were originally meant to be a part of
Pandas Merging 101,
but due to the nature and size of the content required to fully do
justice to this topic, it has been moved to ...

**387**

votes

**14**answers

483k views

### How to pretty-print a numpy.array without scientific notation and with given precision?

I'm curious, whether there is any way to print formatted numpy.arrays, e.g., in a way similar to this:
x = 1.23456
print '%.3f' % x
If I want to print the numpy.array of floats, it prints several ...

**542**

votes

**17**answers

808k views

### Is there a NumPy function to return the first index of something in an array?

I know there is a method for a Python list to return the first index of something:
>>> l = [1, 2, 3]
>>> l.index(2)
1
Is there something like that for NumPy arrays?
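
One possible way to get similar behaviour, sketched here, is to build a boolean mask and take the first position where it is true. The choice of `None` for "not found" is an assumption of this sketch; `list.index` raises `ValueError` instead.

```python
import numpy as np

arr = np.array([1, 2, 3, 2])

# np.flatnonzero returns the flat indices where the condition holds.
matches = np.flatnonzero(arr == 2)

# Take the first match if there is one; None signals "not found" here.
first = int(matches[0]) if matches.size else None
print(first)  # 1
```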

**465**

votes

**7**answers

821k views

### pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

I want to apply my custom function (it uses an if-else ladder) to these six columns (ERI_Hispanic, ERI_AmerInd_AKNatv, ERI_Asian, ERI_Black_Afr.Amer, ERI_HI_PacIsl, ERI_White) in each row of my ...

**487**

votes

**13**answers

929k views

### How do I read CSV data into a record array in NumPy?

I wonder if there is a direct way to import the contents of a CSV file into a record array, much in the way that R's read.table(), read.delim(), and read.csv() family imports data to R's data frame?
...

**422**

votes

**15**answers

397k views

### Sorting arrays in NumPy by column

How can I sort an array in NumPy by the nth column?
For example,
a = array([[9, 2, 3],
[4, 5, 6],
[7, 0, 5]])
I'd like to sort rows by the second column, such that I get back:
...

**164**

votes

**10**answers

163k views

### Fitting empirical distribution to theoretical ones with Scipy (Python)?

INTRODUCTION: I have a list of more than 30,000 integer values ranging from 0 to 47, inclusive, e.g.[0,0,0,0,..,1,1,1,1,...,2,2,2,2,...,47,47,47,...] sampled from some continuous distribution. The ...

**119**

votes

**5**answers

49k views

### What is the difference between size and count in pandas?

What is the difference between groupby("x").count and groupby("x").size in pandas?
Does size just exclude nil?

**726**

votes

**21**answers

775k views

### How to print the full NumPy array, without truncation?

When I print a numpy array, I get a truncated representation, but I want the full array.
Is there any way to do this?
Examples:
>>> numpy.arange(10000)
array([ 0, 1, 2, ..., 9997, ...

**189**

votes

**9**answers

81k views

### NumPy or Pandas: Keeping array type as integer while having a NaN value

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element inside listed as numpy.NaN?
In particular, I am converting an in-...

**239**

votes

**8**answers

1.2m views

### ValueError: setting an array element with a sequence

This Python code:
import numpy as p
def firstfunction():
UnFilteredDuringExSummaryOfMeansArray = []
MeanOutputHeader=['TestID','ConditionName','FilterType','RRMean','HRMean',
...

**170**

votes

**8**answers

112k views

### 'and' (boolean) vs '&' (bitwise) - Why difference in behavior with lists vs numpy arrays?

What explains the difference in behavior of boolean and bitwise operations on lists vs NumPy arrays?
I'm confused about the appropriate use of & vs and in Python, illustrated in the following ...

**117**

votes

**8**answers

56k views

### Numpy `logical_or` for more than two arguments

Numpy's logical_or function takes no more than two arrays to compare. How can I find the union of more than two arrays? (The same question could be asked with regard to Numpy's logical_and and ...

**24**

votes

**4**answers

7k views

### Indexing one array by another in numpy

Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
...

**233**

votes

**29**answers

474k views

### Moving average or running mean

Is there a SciPy function or NumPy function or module for Python that calculates the running mean of a 1D array given a specific window?
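
A minimal sketch of one way to do this with plain NumPy: convolve the data with a uniform kernel. The helper name `running_mean` is hypothetical, and `mode="valid"` is a choice made here so that only full windows are averaged.

```python
import numpy as np


def running_mean(x, window):
    """Mean over each full window of length `window` (hypothetical helper)."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(x, kernel, mode="valid")


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(running_mean(x, 3))  # [2. 3. 4.]
```

The result has `len(x) - window + 1` elements; other `mode` values pad the edges instead.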

**396**

votes

**5**answers

168k views

### What are the differences between numpy arrays and matrices? Which one should I use?

What are the advantages and disadvantages of each?
From what I've seen, either one can work as a replacement for the other if need be, so should I bother using both or should I stick to just one of ...

**73**

votes

**3**answers

17k views

### Fast punctuation removal with pandas

This is a self-answered post. Below I outline a common problem in the NLP domain and propose a few performant methods to solve it.
Oftentimes the need arises to remove punctuation during text ...

**410**

votes

**3**answers

242k views

### Simple Digit Recognition OCR in OpenCV-Python

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV.
I have 100 samples (i.e. ...

**264**

votes

**8**answers

93k views

### Understanding NumPy's einsum

I'm struggling to understand exactly how einsum works. I've looked at the documentation and a few examples, but it's not seeming to stick.
Here's an example we went over in class:
C = np.einsum("...

**400**

votes

**19**answers

391k views

### Find nearest value in numpy array

Is there a numpy-thonic way, e.g. function, to find the nearest value in an array?
Example:
np.find_nearest( array, value )
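
NumPy does not ship a function called `find_nearest`; the name in the example is the asker's placeholder. A minimal sketch of a user-defined helper with that name, assuming "nearest" means smallest absolute difference:

```python
import numpy as np


def find_nearest(array, value):
    """Return the element of `array` closest to `value` (user-defined helper)."""
    array = np.asarray(array)
    # Index of the smallest absolute difference.
    idx = np.abs(array - value).argmin()
    return array[idx]


data = np.array([0.1, 0.5, 0.9, 1.4])
print(find_nearest(data, 1.0))  # 0.9
```

When two elements are equally close, `argmin` returns the first one.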

**7**

votes

**2**answers

1k views

### Using a NumPy array as indices of the 2nd dim of another array? [duplicate]

For example, I have two numpy arrays,
A = np.array(
[[0,1],
[2,3],
[4,5]])
B = np.array(
[[1],
[0],
[1]], dtype='int')
and I want to extract one element from each row of A, and ...

**661**

votes

**11**answers

925k views

### Dump a NumPy array into a csv file

Is there a way to dump a NumPy array into a CSV file? I have a 2D NumPy array and need to dump it in human-readable format.
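
A minimal sketch using `np.savetxt`, which writes a 1D or 2D array as delimited text. The in-memory buffer and the `fmt="%d"` integer format are choices made for this illustration; passing a filename instead of the buffer writes to disk.

```python
import io

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# An in-memory buffer stands in for a file; a path string also works.
buffer = io.StringIO()
np.savetxt(buffer, a, delimiter=",", fmt="%d")

print(buffer.getvalue())
```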

**193**

votes

**9**answers

148k views

### How to split data into 3 sets (train, validation and test)?

I have a pandas dataframe and I wish to divide it to 3 separate sets. I know that using train_test_split from sklearn.cross_validation, one can divide the data in two sets (train and test). However, I ...

**333**

votes

**20**answers

694k views

### Saving a Numpy array as an image

I have a matrix in the type of a Numpy array. How would I write it to disk as an image? Any format works (png, jpeg, bmp...). One important constraint is that PIL is not present.

**4**

votes

**1**answer

3k views

### Efficiently Using Multiple Numpy Slices for Random Image Cropping

I have a 4-D numpy array, with the first dimension representing the number of images in a data set, the second and third being the (equal) width and height, and the 4th being the number of channels (3)...

**371**

votes

**17**answers

621k views

### How to add an extra column to a NumPy array

Let’s say I have a NumPy array, a:
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
And I would like to add a column of zeros to get an array, b:
b = np.array([
[1, 2, 3, 0],
[2, 3, 4, 0]
...

**167**

votes

**16**answers

276k views

### How to calculate rolling / moving average using python + NumPy / SciPy?

There seems to be no function that simply calculates the moving average on numpy/scipy, leading to convoluted solutions.
My question is two-fold:
What's the easiest way to (correctly) implement a ...

**235**

votes

**16**answers

457k views

### Transposing a 1D NumPy array

I use Python and NumPy and have some problems with "transpose":
import numpy as np
a = np.array([5,4])
print(a)
print(a.T)
Invoking a.T is not transposing the array. If a is for example [[],[]] ...
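
A short sketch of the behaviour described: a 1D array has a single axis, so transposing it has nothing to swap. Adding an explicit second axis with `np.newaxis` is one way to get a column; the variable name `col` is just for illustration.

```python
import numpy as np

a = np.array([5, 4])

# Transposing a 1D array leaves its shape unchanged.
print(a.T.shape)  # (2,)

# Add a second axis to obtain a column with shape (2, 1).
col = a[:, np.newaxis]
print(col.shape)  # (2, 1)
```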

**120**

votes

**5**answers

105k views

### What is the difference between NaN and None?

I am reading two columns of a csv file using pandas read_csv() and then assigning the values to a dictionary. The columns contain strings of numbers and letters. Occasionally there are cases where a ...

**146**

votes

**7**answers

407k views

### Unable to allocate array with shape and data type

I'm facing an issue with allocating huge arrays in numpy on Ubuntu 18 while not facing the same issue on MacOS.
I am trying to allocate memory for a numpy array with shape (156816, 36, 53806)
with
...