How much additional memory is needed to perform a 3D FFT other than the matrix to be transformed? GPU application.

Question

Nathan Zechar on 10 May 2020

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/524477-how-much-additional-memory-is-needed-to-perfrom-a-3d-fft-other-than-matrix-to-be-transformed-gpu-ap

Commented: Nathan Zechar on 15 May 2020

Hello. I'm trying to gain an understanding of how much memory is needed to perform an FFT, and if it is different with respect to performing it on a GPU.

For instance, it appears I can only use up to 67% of my GPU memory before an error is thrown. I can't seem to go above this value with the following script:

clear all
Nx = 256;
Ny = 256;
Nz = 512;
A = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
A = gpuArray(A);
A = fftn(A);
B = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
B = gpuArray(B);
B = fftn(B);
C = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
C = gpuArray(C);
C = fftn(C);
D = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
D = gpuArray(D);
D = fftn(D);
E = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
E = gpuArray(E);
E = fftn(E);
F = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
F = gpuArray(F);
F = fftn(F);
G = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
G = gpuArray(G);
G = fftn(G);
H = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
H = gpuArray(H);
H = fftn(H);
I = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
I = gpuArray(I);
I = fftn(I);
J = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz);
J = gpuArray(J);
J = fftn(J);
bytes    = 16;            % Bytes used for complex number
Tbytes   = Nx*Ny*Nz*bytes; % Total number of bytes per matrix
NoTran   = 10;            % Number of FFT transforms in memory
GPUmem   = 8e9;           % 8 GBytes of GPU memory
% Theoretical percentage of GPU memory used with all transforms
percent = (Tbytes/GPUmem)*NoTran*100   % displays 67.1089

If I add another matrix, say 'K', constructed in the same way as the other matrices, an error is thrown.

If I query the GPU, I obtain a different answer than my calculation:

 gpuDevice
ans = 
  CUDADevice with properties:
                      Name: 'GeForce RTX 2070 with Max-Q Design'
                     Index: 1
         ComputeCapability: '7.5'
            SupportsDouble: 1
             DriverVersion: 1.0200e+01
            ToolkitVersion: 1.0100e+01
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 8.5899e+09
           AvailableMemory: 1.5127e+09
           
TotalMemory      = 8.5899e+09
AvailableMemory  = 1.5127e+09
% Percentage of GPU memory used
percent          = (1 - AvailableMemory/TotalMemory)*100   % about 82.39

This result is somewhat confusing, as I made sure to enable only my computer's integrated graphics rather than the GPU. Changing this setting in the NVIDIA Control Panel does not appear to change 'AvailableMemory' when I rerun all the matrices and check the available memory.

So my calculation for 'Tbytes' is wrong as it appears more memory is being used. Additionally, it appears there are 8.6 GBytes of total memory available on the GPU - I'm not going to complain about that.

So, how much additional memory is needed to perform a 3D FFT in MATLAB other than the starting matrix, and does performing one on a GPU make a difference?

That is, a matrix A of complex numbers with Nx*Ny*Nz elements should theoretically require (Nx*Ny*Nz)*16 bytes of memory. However, to do a 3D FFT on that matrix, I believe it should require at least double that amount of memory when considering the transform matrix (including the zeros of that transform matrix). But it seems even more memory than that is required.
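
To make the mismatch concrete, here is a rough budget built only from the figures quoted above. It is a sketch rather than a measurement: it assumes each fftn call briefly needs the input and an equally sized output at the same time, and the variable names are just illustrative.

```matlab
% Rough memory budget from the numbers reported above (no GPU required).
arrayBytes  = 256*256*512*16;        % one complex double array: 536870912 bytes
totalMem    = 8.5899e9;              % TotalMemory reported by gpuDevice
availMem    = 1.5127e9;              % AvailableMemory with arrays A..J resident

usedMem     = totalMem - availMem;   % about 7.08e9 bytes in use
arraysMem   = 10*arrayBytes;         % about 5.37e9 bytes for the ten results
unexplained = usedMem - arraysMem;   % about 1.71e9 bytes not accounted for

% Assumption: transforming an 11th array K needs K plus an output copy.
headroom    = availMem - 2*arrayBytes;   % about 4.4e8 bytes left for anything else

fprintf('Unexplained usage: %.2f GB\n', unexplained/1e9);
fprintf('Headroom for K:    %.2f GB\n', headroom/1e9);
```

So roughly 1.7 GB is in use beyond the ten arrays, and an input-plus-output pair for K would leave only about 0.44 GB, which is apparently not enough.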


Answer 1

Hamza Butt on 13 May 2020

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/524477-how-much-additional-memory-is-needed-to-perfrom-a-3d-fft-other-than-matrix-to-be-transformed-gpu-ap#answer_432176

Edited: Hamza Butt on 13 May 2020

Hi Nathan,

There are some additional memory requirements to consider while performing FFT operations:

1. The operation is out-of-place, which means that a copy of the matrix is made for storing the result.
2. The FFT plan required to execute the operation can vary in size depending on the properties of the input data. For more information on cuFFT plans, please see: https://docs.nvidia.com/cuda/cufft/index.html#cufft-setup
3. During the execution of the FFT, a temporary workspace is required, the size of which depends on the algorithm chosen in the FFT plan. For data whose dimensions are powers of two, cuFFT requires a smaller workspace. The requirement increases if the dimensions have large prime factors, in which case cuFFT resorts to other algorithms that may use more workspace memory than the input data itself.
4. MATLAB also loads CUDA libraries, which may use up their own memory on initialization.

While you have accounted for (1), the memory requirement for (2) and (3) can be difficult to estimate, as they rely on the internals of cuFFT. (4) will be constant for every MATLAB instance that uses gpuArrays.
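
One way to get a feel for these costs on your own hardware is to compare AvailableMemory before and after a transform. The following is a minimal sketch, not a precise profiler: it assumes Parallel Computing Toolbox and a supported GPU, the variable names are illustrative, and it only sees memory still held after fftn returns, so the transient workspace in (3) is not captured.

```matlab
% Assumes Parallel Computing Toolbox and a supported NVIDIA GPU.
g  = gpuDevice;                 % handle to the currently selected GPU
Nx = 256; Ny = 256; Nz = 512;   % sizes from the question

% Build the complex input directly on the GPU.
A = complex(rand(Nx,Ny,Nz,'gpuArray'), rand(Nx,Ny,Nz,'gpuArray'));
arrayBytes = 16*numel(A);       % complex double: 16 bytes per element

wait(g);                        % let pending GPU work finish
memBefore = g.AvailableMemory;

B = fftn(A);                    % result is stored in a new array
wait(g);
memAfter = g.AvailableMemory;

% Memory still held after the call, in units of one array.
% This covers the result and anything kept alive afterwards, but not
% workspace that was released before fftn returned.
heldInArrays = (memBefore - memAfter)/arrayBytes
```

A value noticeably above 1 would suggest that something besides the result stays allocated. Repeating the measurement with other sizes (for example, dimensions with large prime factors) may show how much the overhead varies.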

In response to your question about the total available memory being 8.5899e+09 bytes: if you have CUDA installed, you can run "nvidia-smi" or "nvidia-smi --query-gpu=memory.total --format=csv" where you will find the total memory in "MiB". Note that "MiB" and "MB" are not the same, and for the case of the RTX 2070 Max-Q (and my RTX 2080 Max-Q), "8192MiB" translates to the value you are seeing in bytes.
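
As a quick check of that conversion, the arithmetic can be written out as follows (the variable names are only illustrative):

```matlab
% Convert the 8192 MiB reported by nvidia-smi into bytes.
MiB         = 2^20;            % 1 MiB = 1048576 bytes
totalMiB    = 8192;            % value quoted above for the RTX 2070 Max-Q
totalBytes  = totalMiB*MiB;    % 8589934592, displayed by MATLAB as 8.5899e+09
totalGBytes = totalBytes/1e9;  % about 8.59 decimal gigabytes
```

This is why gpuDevice reports slightly more than the 8e9 bytes assumed in your calculation.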

I hope this answers your question. Please let me know if you would like any further clarifications.

1 Comment

Nathan Zechar on 15 May 2020

Thank you so much for taking the time to answer my question. It is very appreciated!
