Best CPU for large data: Intel i9-9900K or AMD Threadripper 2950X?

I work with large datasets (> 2 TB) and often have matrices larger than 1,000,000 x 1,000,000 rows/cols. Computation times are on the order of weeks, and the time it takes me to create plot animations (GIFs) is on the order of hours. I am building a PC that will be able to do computations and plotting (primarily contouring and texture mapping) much faster. I use lots of for loops and if statements, since most of the operations I do depend heavily on the previous iteration. Some functions that appear frequently in my script are: permute, reshape, regionprops, labelmatrix, concatenation, indexing, and max/mean/sum. I have the Parallel Computing Toolbox and will use GPU computing (GTX 1070) for simple but long operations. The PC will also have 64 GB of RAM.
My question is: besides optimizing the script, which CPU is best suited to heavy computations in MATLAB? 16 cores and 32 threads on the Threadripper 2950X, but with only a 4.4 GHz clock speed, or 8 cores and 16 threads at up to 5 GHz on the i9-9900K? MATLAB has very little documentation on selecting hardware, and I am looking for any advice. Has MATLAB become better optimized for AMD, or should I stick with Intel? Also, are there any benchmarks from MathWorks that compare CPUs for different operations across different toolboxes? I use R2017a; would upgrading to the latest release benefit me if I chose AMD over Intel?

7 Comments

If you run the bench function you can see results for different systems. It does look like there is a core-count advantage, but I wouldn't know for your specific functions. The LU test seems to be more clock-speed bound.
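For reference, bench can return its raw timings so you can compare runs or machines numerically rather than just reading the chart (a minimal sketch; the column order below is my understanding of the standard tests):

```matlab
% Run the standard MATLAB benchmark 3 times and keep the raw timings.
% Each row is one repetition; columns correspond to the individual
% tests (LU, FFT, ODE, Sparse, 2-D graphics, 3-D graphics).
t = bench(3);

% Use the best (fastest) time per test across repetitions.
bestRun = min(t, [], 1);
disp(bestRun)
```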
Maybe you should try contacting support directly with the Contact Us button at the top of this page.
If the iterative code is vectorized, then basic linear algebra, including plain addition of vectors, is dispatched to high-performance routines that split the task between threads. If the thread count rises too high relative to the array size, that becomes less effective. Peak efficiency can turn out to be on the order of 8 cores, but perhaps more with sufficiently large arrays.
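To illustrate the difference, here is a sketch comparing an explicit element-by-element loop with the equivalent vectorized expression (timings will vary by machine; the array size here is just an example):

```matlab
% Vectorized MATLAB expressions are dispatched to multithreaded
% math libraries; explicit loops run one element at a time.
n = 1e7;
a = rand(n, 1, 'single');
b = rand(n, 1, 'single');

% Loop version: processes one element per iteration.
tic
c = zeros(n, 1, 'single');
for k = 1:n
    c(k) = a(k) + b(k);
end
tLoop = toc;

% Vectorized version: one call, split across threads internally.
tic
c2 = a + b;
tVec = toc;

fprintf('loop: %.3f s   vectorized: %.3f s\n', tLoop, tVec);
```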
If the iterative parts are not vectorized, then the two constraints are clock rate and core memory bandwidth, because the work will be done by a single thread, unless the problem can be split into somewhat independent chunks.
Thanks for the responses. I have run the bench function, but it shows a very limited range of CPUs. The problem is that much of my code relies on previous iterations within the loop, so I can't put it in a parfor. Furthermore, my data array is roughly 45 GB in single precision, so even simple operations take a while. MATLAB uses most of the cores fairly well with my current i7 processor, but I wanted to know whether more cores would help even more. AMD offers more cores for the money, so perhaps a better question is: if an AMD CPU and an Intel CPU both had 8 cores/16 threads and the same 4.4 GHz clock speed, which would complete the same theoretical script quicker? I've heard MATLAB isn't optimized for AMD CPUs, but I don't know whether recent versions of MATLAB have corrected this.
OCDER
OCDER on 11 Feb 2019
Edited: OCDER on 11 Feb 2019
"Some functions that appear frequently in my script are: permute, reshape, regionprops, labelmatrix, concatenation, indexing, max/mean/sum."
In my experience, permute, reshape, and concatenation can be pretty slow inside large loops, so finding a workaround would be better.
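For example, growing an array by concatenation forces a reallocation and full copy on every iteration; preallocating once and indexing into the array avoids that (a minimal sketch):

```matlab
% Slow pattern: repeated concatenation copies the array each iteration.
nIter = 1e5;
out = [];
for k = 1:nIter
    out = [out; k^2];   %#ok<AGROW>  % whole array is copied every time
end

% Faster pattern: preallocate once, then write by index.
out2 = zeros(nIter, 1);
for k = 1:nIter
    out2(k) = k^2;
end
```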
"what CPU is most optimized for heavy computations in MATLAB?"
I feel like we have more or less reached the peak of CPU clock speeds, and the real limitations lie in hard drive and memory bandwidth. Having more cores won't help you unless you can use parfor.
Instead of buying a new CPU, perhaps use the profiler to see which step is the slowest and rewrite that code, or use MEX routines. These can offer a considerable speed boost without any new hardware.
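Using the profiler is straightforward; here is a minimal sketch (myAnalysisScript is a hypothetical name, substitute your own script or function):

```matlab
% Profile a run of your code to find the hot spots before
% spending money on hardware.
profile on
myAnalysisScript    % hypothetical: replace with your own script/function
profile viewer      % opens a report of time spent per function and line
```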
"large datasets (> 2TB) and have matrices often larger than 1,000,000 x 1,000,000 rows/cols."
If so, your 64 GB of RAM will not be able to hold a 1e6 x 1e6 double matrix, which is ~8000 GB: (8 bytes/element) * (1e12 elements) * (1 GB / 1e9 bytes) = 8000 GB.
This means you'll have to access your hard drive a lot and load data in chunks, which is probably what your for loop is already doing. In that case, investing in a better SSD would be a better way to spend your money.
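If you aren't already chunking this way, matfile lets you read slices of a v7.3 MAT-file without loading the whole variable into memory. A sketch, assuming a file 'bigdata.mat' containing a large matrix variable 'A' (both names are placeholders):

```matlab
% Process a huge on-disk matrix in row blocks via matfile.
m = matfile('bigdata.mat');      % must be a v7.3 MAT-file for partial I/O
[nRows, ~] = size(m, 'A');       % query the size without loading 'A'

chunk = 1e4;                     % rows per block; tune to available RAM
for r = 1:chunk:nRows
    rows = r:min(r + chunk - 1, nRows);
    block = m.A(rows, :);        % loads only this slab from disk
    % ... process block here ...
end
```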
Thanks for the response, OCDER. I am using a large HDD (total data > 10 TB), then loading very small chunks (roughly 30 GB) onto a Samsung 970 Pro. I was looking for any other way to rapidly improve computational speed without having to invest in time on a supercomputer. I will try to optimize the code as best as possible, but I think everyone's responses have helped me decide against a CPU upgrade. Thanks again for the comments!
The sites I find imply that the i9 typically beats the 2950X per core. The second one I linked to shows a higher aggregate score for the AMD, but based on twice as many cores.
Thanks, both articles were very helpful!

Sign in to comment.

Answers (0)

Asked: 10 Feb 2019

Commented: 11 Feb 2019
