CUDA Memory Transpose

Tasneem AbuQutaish

Estimable
Mar 20, 2015
I'm running a matrix transpose program on my PC with a GTX 850M, using the transpose from this blog post: http://devblogs.nvidia.com/parallelforall/efficient-matrix-transpose-cuda-cc/ I want to handle huge matrices, with dimensions up to 20000 and 30000, but I get an out-of-memory error. Is it related to my GPU's thread space (1024 threads per block)? What do you recommend I do to solve this problem?
 
Pinhedd

Distinguished
Moderator


A 20,000 × 20,000 matrix has 400,000,000 elements. At 4 bytes per element (float32), the matrix alone occupies 1.6 gigabytes of memory.
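To make that arithmetic easy to repeat for other sizes, here is a minimal sketch in plain C++ (it should also compile as host code in a .cu file). The function names are made up for illustration. It assumes an out-of-place transpose, where separate input and output buffers both live in GPU memory, so the requirement is twice the size of one matrix. The threads-per-block limit plays no part in this calculation.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one n x n matrix with elem_size-byte elements
// (4 bytes for float32).
inline std::uint64_t matrix_bytes(std::uint64_t n, std::uint64_t elem_size = 4) {
    assert(elem_size > 0);
    return n * n * elem_size;
}

// Assumption: the transpose is out-of-place, so an input buffer and an
// output buffer of the same size must both be resident at once.
inline std::uint64_t out_of_place_bytes(std::uint64_t n, std::uint64_t elem_size = 4) {
    return 2 * matrix_bytes(n, elem_size);
}

// Largest square dimension whose input and output buffers both fit in
// budget_bytes. A simple linear search keeps the arithmetic exact.
inline std::uint64_t max_square_dim(std::uint64_t budget_bytes, std::uint64_t elem_size = 4) {
    std::uint64_t n = 0;
    while (out_of_place_bytes(n + 1, elem_size) <= budget_bytes) {
        ++n;
    }
    return n;
}
```

For n = 20000 this gives 1.6 GB for one matrix and 3.2 GB for both buffers. For n = 30000, one matrix alone is 3.6 GB. The budget passed to max_square_dim should be the free memory the card actually reports (the CUDA runtime call cudaMemGetInfo returns it), which is less than the total on the spec sheet.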

You're running out of memory. Your options are to subdivide the problem into multiple submatrices that are transposed individually, or to choose a smaller matrix size.
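The subdivision approach can be sketched as follows. This is a CPU-only illustration in plain C++, not the blog's kernel. The function name and the two staging buffers are hypothetical, and the buffers stand in for a pair of small GPU allocations. The key point is the indexing: the tile at block position (r0, c0) of the input is transposed on its own and written to block position (c0, r0) of the output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transposes the rows x cols row-major matrix `in` into the cols x rows
// row-major matrix `out`, one tile x tile submatrix at a time.
void transpose_tiled(const std::vector<float>& in, std::vector<float>& out,
                     std::size_t rows, std::size_t cols, std::size_t tile) {
    assert(tile > 0);
    assert(in.size() == rows * cols && out.size() == rows * cols);

    // Staging buffers: in a GPU version these would be the only device
    // allocations, so device memory use is 2 * tile * tile elements.
    std::vector<float> tile_in(tile * tile), tile_out(tile * tile);

    for (std::size_t r0 = 0; r0 < rows; r0 += tile) {
        const std::size_t th = std::min(tile, rows - r0);      // tile height
        for (std::size_t c0 = 0; c0 < cols; c0 += tile) {
            const std::size_t tw = std::min(tile, cols - c0);  // tile width

            // 1. Gather the th x tw submatrix (GPU: host-to-device copy).
            for (std::size_t r = 0; r < th; ++r)
                for (std::size_t c = 0; c < tw; ++c)
                    tile_in[r * tw + c] = in[(r0 + r) * cols + (c0 + c)];

            // 2. Transpose the tile, th x tw -> tw x th (GPU: the kernel).
            for (std::size_t r = 0; r < th; ++r)
                for (std::size_t c = 0; c < tw; ++c)
                    tile_out[c * th + r] = tile_in[r * tw + c];

            // 3. Scatter to the mirrored block of `out`, whose rows are
            //    `rows` elements long (GPU: device-to-host copy).
            for (std::size_t c = 0; c < tw; ++c)
                for (std::size_t r = 0; r < th; ++r)
                    out[(c0 + c) * rows + (r0 + r)] = tile_out[c * th + r];
        }
    }
}
```

In a CUDA version, steps 1 and 3 would likely be strided copies (cudaMemcpy2D is the runtime call for that) and step 2 would be the transpose kernel. The full input and output matrices still have to fit in host RAM, but the tile size can be picked so that the two device buffers fit comfortably in whatever GPU memory is free.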