In-place type conversion of a NumPy array

Given a NumPy array of int32, how do I convert it to float32 in place? So basically, I would like to do

a = a.astype(numpy.float32)

without copying the array. It is big.

The reason for doing this is that I have two algorithms for the computation of a. One of them returns an array of int32, the other returns an array of float32 (and this is inherent to the two different algorithms). All further computations assume that a is an array of float32.

Currently I do the conversion in a C function called via ctypes. Is there a way to do this in Python?

Update: astype with copy=False only avoids a copy when it can, so this is not the correct answer to this question. unutbu's answer is the right one.


a = a.astype(numpy.float32, copy=False)

NumPy's astype has a copy flag. Why shouldn't we use it?

    • The copy flag only says that if the change can be done without a copy, it will be done without a copy. However, if the type is different, it will still always copy.
    • Once this parameter is supported in a NumPy release, we could of course use it, but currently it's only available in the development branch. And at the time I asked this question, it didn't exist at all.
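The point made in these comments can be checked directly. The following is a minimal sketch, assuming a NumPy version that provides both the copy argument and np.shares_memory; the variable names are only illustrative.

```python
import numpy as np

a = np.arange(5, dtype=np.int32)

# Same dtype: with copy=False, astype can hand back the input array itself.
same = a.astype(np.int32, copy=False)
print(same is a)

# Different dtype: a new buffer is allocated even with copy=False.
converted = a.astype(np.float32, copy=False)
print(np.shares_memory(a, converted))
```

So copy=False is a request to skip unnecessary copies, not a way to force an in-place conversion.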

You can make a view with a different dtype, and then copy in-place into the view:

import numpy as np
x = np.arange(10, dtype='int32')
y = x.view('float32')
y[:] = x

print(y)

yields

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)

To show the conversion was in-place, note that copying from x to y altered x:

print(x)

prints

array([         0, 1065353216, 1073741824, 1077936128, 1082130432,
       1084227584, 1086324736, 1088421888, 1090519040, 1091567616])
    • Note for those (like me) that want conversion between dtype of different byte-size (e.g. 32 to 16 bits): This method fails because y.size <> x.size. Logical once you think about it :-(
    • Was this solution working for some older version of Numpy? When I do np.arange(10, dtype=np.int32).view(np.float32) on Numpy 1.8.2, I get array([ 0.00000000e+00, 1.40129846e-45, ... [snip] ... 1.26116862e-44], dtype=float32).
    • To clarify the point about the itemsize (number of bits) made by the original answer and @Juh_, consider: a = np.arange(10, dtype='float32'); b = a[::-1]; c = np.vstack((a,b)); d = c.view('float64'). This code takes 10 + 10 float32 values and results in 10, rather than 20, float64 values.
    • This in-place change may save on memory use, but it is slower than a simple x.astype(float) conversion. I wouldn't recommend it unless your script is bordering on MemoryError.
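The item-size restriction raised in the comments can be seen by comparing shapes. This is a small sketch, assuming a NumPy version with np.shares_memory; the names x32, y16, c and d are only for illustration.

```python
import numpy as np

# Same item size (4 bytes -> 4 bytes): the view has the same shape,
# so y[:] = x overwrites the shared buffer element by element.
x = np.arange(10, dtype=np.int32)
y = x.view(np.float32)
y[:] = x
print(np.shares_memory(x, y))

# Different item size (4 bytes -> 2 bytes): the view has twice as many
# elements, so an element-wise copy no longer lines up.
x32 = np.arange(10, dtype=np.int32)
y16 = x32.view(np.int16)
print(x32.shape, y16.shape)

# The example from the comment: 2 x 10 float32 viewed as float64 is 2 x 5.
a = np.arange(10, dtype=np.float32)
c = np.vstack((a, a[::-1]))
d = c.view(np.float64)
print(c.shape, d.shape)
```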

You can change the array type without converting like this:

a.dtype = numpy.float32

but first you have to change all the integers to something that will be interpreted as the corresponding float. A very slow way to do this would be to use Python's struct module like this:

import struct
def toi(i):
    return struct.unpack('i',struct.pack('f',float(i)))[0]

...applied to each member of your array.
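As a rough illustration of the struct approach, the sketch below applies toi to every element of a small int32 array and then reinterprets the buffer. It uses a.view(np.float32) for the final step rather than assigning to a.dtype, which is a substitution made here for the example; both reinterpret the same memory.

```python
import struct

import numpy as np

def toi(i):
    # Pack the value as a 4-byte float, then reread those bytes as a 4-byte int.
    return struct.unpack('i', struct.pack('f', float(i)))[0]

a = np.arange(10, dtype=np.int32)
for k in range(a.size):        # slow: one Python-level call per element
    a[k] = toi(a[k])

b = a.view(np.float32)         # same buffer, now read as float32
print(b)
```

For example, toi(1) is 1065353216, the integer whose bit pattern is the float32 value 1.0.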

But perhaps a faster way would be to use NumPy's ctypeslib tools (which I am unfamiliar with).

- edit -

Since ctypeslib doesn't seem to work, I would do the conversion with the usual numpy.astype method, but in blocks whose size is within your memory limits:

a[0:10000] = a[0:10000].astype('float32').view('int32')

...then change the dtype when done.

Here is a function that accomplishes the task for any compatible dtypes (only works for dtypes with same-sized items) and handles arbitrarily-shaped arrays with user-control over block size:

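import numpy

def astype_inplace(a, dtype, blocksize=10000):
    oldtype = a.dtype
    newtype = numpy.dtype(dtype)
    # Only same-sized items can be rewritten in place.
    assert oldtype.itemsize == newtype.itemsize
    for idx in range(0, a.size, blocksize):
        # Convert one block, then store its bits back under the old dtype.
        a.flat[idx:idx + blocksize] = \
            a.flat[idx:idx + blocksize].astype(newtype).view(oldtype)
    # Reinterpret the whole buffer as the new dtype.
    a.dtype = newtype

# randint defaults to the platform integer, so request int32 explicitly.
a = numpy.random.randint(100, size=100, dtype='int32').reshape((10, 10))
print(a)
astype_inplace(a, 'float32')
print(a)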
    • Thanks for your answer. Honestly, I don't think this is very useful for big arrays -- it is way too slow. Reinterpreting the data of the array as a different type is easy -- for example by calling a.view(numpy.float32). The hard part is actually converting the data. numpy.ctypeslib only helps with reinterpreting the data, not with actually converting it.
    • Thanks for the update. Doing it blockwise is a good idea -- probably the best you can get with the current NumPy interface. But in this case, I will probably stick to my current ctypes solution.

import numpy as np
arr_float = np.arange(10, dtype=np.float32)
arr_int = arr_float.view(np.float32)

Use view() with the dtype parameter to change the array in place.

    • The goal of the question was to actually convert the data in place. After correcting the type in the last line to int, this answer would only reinterpret the existing data as a different type, which isn't what I was asking for.
    • What do you mean? dtype is just the appearance of the data in memory; it really works. However, in astype, the 'casting' parameter controls the conversion mode, and its default is 'unsafe'.
    • Yeah, I agree with the first accepted answer. However, arr_.astype(new_dtype, copy=False) still returns a newly allocated array. How can the dtype, order, and subok requirements be satisfied so that the input array is returned instead of a copy? I haven't solved it.
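The difference between reinterpreting and converting that this thread turns on can be shown with three values. This is a minimal sketch; the names reinterpreted and converted are only for illustration.

```python
import numpy as np

f = np.arange(3, dtype=np.float32)     # 0.0, 1.0, 2.0

reinterpreted = f.view(np.int32)       # same bytes, read as integers
converted = f.astype(np.int32)         # new array, values converted

print(reinterpreted)
print(converted)
```

The view shares memory with f but shows the raw bit patterns (0, 1065353216, 1073741824), while astype gives 0, 1, 2 in a separate buffer.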

Use this:

In [105]: a
Out[105]: 
array([[15, 30, 88, 31, 33],
       [53, 38, 54, 47, 56],
       [67,  2, 74, 10, 16],
       [86, 33, 15, 51, 32],
       [32, 47, 76, 15, 81]], dtype=int32)

In [106]: float32(a)
Out[106]: 
array([[ 15.,  30.,  88.,  31.,  33.],
       [ 53.,  38.,  54.,  47.,  56.],
       [ 67.,   2.,  74.,  10.,  16.],
       [ 86.,  33.,  15.,  51.,  32.],
       [ 32.,  47.,  76.,  15.,  81.]], dtype=float32)

a = np.subtract(a, 0., dtype=np.float32)

    • Why should this be an in-place conversion? numpy.subtract returns a copy, doesn't it? Only the name a is reused for another chunk of data... Please explain if I am wrong about this.
    • While this code snippet may be the solution, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.
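The objection in the first comment can be checked by keeping a second reference to the original buffer. The sketch below assumes a NumPy version with np.shares_memory; the name original is hypothetical and only serves to hold on to the first array.

```python
import numpy as np

a = np.arange(5, dtype=np.int32)
original = a                               # keep a handle on the first buffer

a = np.subtract(a, 0., dtype=np.float32)   # rebinds the name `a`
print(a.dtype)
print(np.shares_memory(a, original))
print(original.dtype)

b = np.float32(original)                   # likewise produces a new array
print(np.shares_memory(b, original))
```

In both cases the int32 buffer is left untouched and a separate float32 array is allocated, so neither call is an in-place conversion.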
