Python: Fast way of finding the mean of large 4D-arrays filled with mostly zeros
I have many large 4D arrays that I need to take the average of. These arrays are mostly filled with zeros (>99%), and each array has its non-zero values in different locations. Each array also has a corresponding array of weights to use when taking the average.
Taking the average of the arrays in the straightforward way (below) takes a long time to compute and results in a memory error for me.
>>> import numpy as np
>>> # four dense 4D arrays of zeros: two data arrays and their weights
>>> a, b, weights_a, weights_b = [np.zeros((150,150,150,150)) for _ in range(4)]
>>> # flat indices of the few non-zero entries in each array
>>> valinds_a = np.random.randint(0, a.size, 7000)
>>> valinds_b = np.random.randint(0, b.size, 7000)
>>> a.ravel()[valinds_a] = np.random.random(7000)
>>> weights_a.ravel()[valinds_a] = np.random.random(7000)
>>> b.ravel()[valinds_b] = np.random.random(7000)
>>> weights_b.ravel()[valinds_b] = np.random.random(7000)
>>> avg = np.average([a, b], 0, weights=[weights_a, weights_b])
I am looking for a faster way to compute the mean. I am thinking there must be a way, since most of the values are zeros. I looked into using sparse arrays, but there is no support for arrays with more than 2 dimensions.
One way is to take the average only where either a or b is non-zero, since we know the average is 0 otherwise.
If you have access to valinds_a and valinds_b, this can be done like:
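# flat indices where at least one of the two arrays has a value
valinds_both = np.union1d(valinds_a, valinds_b)
avg = np.zeros_like(a)
# weighted average computed only at those positions; everything else stays 0
avg.ravel()[valinds_both] = np.average(
    [a.ravel()[valinds_both], b.ravel()[valinds_both]],
    axis=0,
    weights=[weights_a.ravel()[valinds_both], weights_b.ravel()[valinds_both]])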
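If the index arrays are not available, the same idea might be sketched by recovering the indices from the weight arrays themselves. The helper name `sparse_weighted_mean` and the small demo shape below are hypothetical, chosen for illustration only, and the sketch assumes non-negative weights so that the weight sum is non-zero wherever any weight is non-zero. Note that it still needs the dense input arrays in memory (a 150x150x150x150 float64 array is about 4 GB); it only avoids stacking them and dividing over every element.

```python
import numpy as np

def sparse_weighted_mean(arrays, weights):
    """Weighted mean of same-shaped arrays, computed only where some weight is non-zero.

    Hypothetical helper: assumes non-negative weights and treats positions
    where every weight is zero as having mean 0.
    """
    # Flat (C-order) indices where at least one weight array is non-zero.
    idx = np.unique(np.concatenate([np.flatnonzero(w) for w in weights]))
    # Gather only those entries; both results have shape (n_arrays, len(idx)).
    vals = np.stack([a.ravel()[idx] for a in arrays])
    wts = np.stack([w.ravel()[idx] for w in weights])
    # np.zeros returns a C-contiguous array, so ravel() below is a writable view.
    out = np.zeros(arrays[0].shape, dtype=float)
    out.ravel()[idx] = (vals * wts).sum(axis=0) / wts.sum(axis=0)
    return out

# Small demo (shape reduced so it runs quickly).
rng = np.random.default_rng(0)
shape = (6, 6, 6, 6)
a, b, weights_a, weights_b = [np.zeros(shape) for _ in range(4)]
inds_a = rng.choice(a.size, 20, replace=False)
inds_b = rng.choice(b.size, 20, replace=False)
a.ravel()[inds_a] = rng.random(20)
weights_a.ravel()[inds_a] = rng.random(20) + 0.1  # strictly positive weights
b.ravel()[inds_b] = rng.random(20)
weights_b.ravel()[inds_b] = rng.random(20) + 0.1
avg = sparse_weighted_mean([a, b], [weights_a, weights_b])
```

Because only the selected entries are gathered, the temporary arrays scale with the number of non-zero weights rather than with the full array size.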