
I have a NumPy function that takes an existing array x and an integer k equal to the number of columns of the array. The function does some calculations in a new array and finally returns that new array once the calculations are done.

import numpy as np

def ff(x, k):
    # One uint32 output value per row of x.
    newarr = np.zeros((x.shape[0], 1), dtype=np.uint32)
    newarr[:, 0] = x[:, 0] * 4
    for i in range(1, k - 1):
        newarr[:, 0] += x[:, i]
        newarr[:, 0] = newarr[:, 0] * 4
    newarr[:, 0] += x[:, -1]
    return newarr

newarr is the array where the calculated data is stored. It is initialized with np.zeros and an explicit dtype, and it has a shape of (y, 1), where y is the number of rows of the x array.

The first step stores in the (only) column of newarr the first column of x multiplied by 4.

The iteration runs over columns 1 to k - 2 of the original array x (that is, range(1, k - 1)). Inside the loop, column i of x is added to the column of newarr, and then the column of newarr is multiplied by 4. Finally, after the iteration is complete, the last column of the original array x is added to the column of newarr.
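As I read it (this restatement is mine, it is not spelled out above), the loop is Horner's scheme for a base-4 polynomial over the columns, so the same result can be written as a single weighted sum:

import numpy as np

# Small self-contained check of the equivalence; the data here is made up.
rng = np.random.default_rng(0)
x_small = rng.integers(0, 4, size=(6, 4), dtype=np.uint8)
k_small = x_small.shape[1]

weights = (4 ** np.arange(k_small - 1, -1, -1)).astype(np.uint32)  # [64, 16, 4, 1]
weighted = (x_small * weights).sum(axis=1, keepdims=True).astype(np.uint32)
assert np.array_equal(ff(x_small, k_small), weighted)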

I'm looking for a way (if there is any) to avoid the creation of newarr and do the calculations in the original array. That is, I want to reduce the memory usage of this function, mainly because x tends to be very big in practice.
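Reusing x itself does not seem possible, because the running values overflow uint8. Still, here is a sketch of mine (the name ff_inplace and the optional out buffer are my additions, not part of the original code) that accumulates into a single reusable uint32 buffer with in-place operations, avoiding the per-step temporaries created by expressions like newarr[:, 0] * 4:

def ff_inplace(x, k, out=None):
    # out, if given, must be a uint32 array of shape (x.shape[0],);
    # it is overwritten, so it can be reused across calls.
    if out is None:
        out = np.empty(x.shape[0], dtype=np.uint32)
    out[:] = x[:, 0]        # widening copy: uint8 -> uint32
    out *= 4                # in-place, no temporary array
    for i in range(1, k - 1):
        out += x[:, i]      # the uint8 column is accumulated in uint32
        out *= 4
    out += x[:, -1]
    return out.reshape(-1, 1)   # view with the same (n, 1) shape as ff's output

Whether this actually lowers the peak memory would need profiling like the runs below, since each temporary is only a single column.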

The x array looks like this.

array([[3, 3, 0, 1],
       [3, 0, 1, 0],
       [0, 1, 0, 2],
       [1, 0, 2, 2],
       [0, 2, 2, 3],
       [2, 2, 3, 1],
       [2, 3, 1, 3],
       .
       .
       .
       [2, 2, 0, 1]], dtype=uint8)

In this case, the k value would be 4.
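For example (a quick check of mine on the first row shown above), each row gets packed as a base-4 number:

first_row = np.array([[3, 3, 0, 1]], dtype=np.uint8)
print(ff(first_row, 4))   # [[241]]  ->  3*64 + 3*16 + 0*4 + 1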

For the sake of completeness, I've attached the memory profiles of my solution and of the proposed dot product, for a big x array of shape (248956413, 10).

My solution

Line #    Mem usage    Increment   Line Contents
================================================
18    322.0 MiB    322.0 MiB   @profile
19                             def ff(x, k):
20    322.0 MiB      0.0 MiB       newarr = np.zeros((x.shape[0], 1), dtype=np.uint32)
21   1271.7 MiB    949.8 MiB       newarr[:, 0] = x[:, 0] * 4
22   1274.2 MiB      0.0 MiB       for i in range(1, k - 1):
23   1274.2 MiB      0.0 MiB           newarr[:, 0] += x[:, i]
24   1274.2 MiB      2.5 MiB           newarr[:, 0] = newarr[:, 0] * 4
25   1274.2 MiB      0.0 MiB       newarr[:, 0] += x[:, -1]
26   1274.2 MiB      0.0 MiB       return newarr

Dot product

Line #    Mem usage    Increment   Line Contents
================================================
18    321.7 MiB    321.7 MiB   @profile
19                             def ff2(x,k):
20   2221.2 MiB   1899.5 MiB       return x.dot(4 ** np.arange(k - 1, -1, -1))

I was expecting a reduction in memory usage. Clearly, I was wrong.

  • This looks like a simple matrix product, with a power series: x.dot(4**np.arange(k-1,-1,-1)). newarr is only one column, or 1-D, a fraction of the size of x (with k columns). I wouldn't worry about memory savings.
    – hpaulj, Mar 20, 2019 at 3:20
  • @hpaulj Indeed it was a dot product, thanks for that! But for the sake of completeness, I've attached the memory profile of both functions. Weird, to say the least!
    – Kako, Mar 20, 2019 at 11:39

1 Answer


The dot solution you showed is close, but is missing one critical thing (which I picked up during rudimentary unit testing). Comparing your old and new methods for apples-to-apples parity, this fails:

assert y.dtype == np.uint32

because the output of dot() was 64-bit, in turn because the output of arange defaulted to 64-bit. This exactly accounts for the 1899.5/949.8 = 1.99989x memory blowup. Pass dtype=np.uint32 and the rest is fine.
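For instance, a sketch of that fix on top of your ff2 (ff3 is just my name for the variant):

def ff3(x, k):
    # uint32 weights keep the dot result at 4 bytes per row instead of 8
    weights = 4 ** np.arange(k - 1, -1, -1, dtype=np.uint32)
    return x.dot(weights).reshape(-1, 1)   # reshape to (n, 1) to match ff

With that, the dtype assertion above passes and the output buffer is back to 4 bytes per row.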

