The Memory, RAM
RAM is also very important: applications, code, state, and intermediate results must all be stored somewhere. Over the years, commodity RAM has become cheap. The same is not true of the high-end RAM used in servers, which I did not know. When working with large data, having a lot of RAM is very useful because it prevents the operating system from having to swap data out to disk. Reading from and writing to a hard disk is extremely slow, and your code should make every effort to avoid it.
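As a rough illustration of the point above (my own sketch, not from the original build), a script can check available memory before loading a large dataset so the OS never has to fall back to swap. The `/proc/meminfo` parsing assumes Linux, and the 20% headroom threshold is an arbitrary choice:

```python
# Sketch: check available memory on Linux before loading a large file,
# so the OS never has to swap data out to slow disk.
# Assumes Linux /proc/meminfo; the headroom fraction is arbitrary.

import os

def mem_available_bytes():
    """Parse MemAvailable (or MemFree as a fallback) from /proc/meminfo."""
    with open("/proc/meminfo") as f:
        info = dict(line.split(":", 1) for line in f)
    key = "MemAvailable" if "MemAvailable" in info else "MemFree"
    return int(info[key].strip().split()[0]) * 1024  # reported in kB

def safe_to_load(path, headroom=0.2):
    """True if the file fits in RAM with some headroom to spare."""
    need = os.path.getsize(path) * (1 + headroom)
    return need <= mem_available_bytes()
```

A crawler or analysis job could call `safe_to_load` before reading a corpus into memory and fall back to chunked processing otherwise.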
RAM has several metrics to take into consideration:
- Type, such as DDR2 and DDR3. DDR3 is recommended as it is faster and can be installed in larger amounts. DDR3 is not compatible with DDR2 or DDR.
- Speed. Common speeds for Xeons are 800 MT/s, 1066 MT/s and 1333 MT/s.
- Registered/buffered vs. unbuffered (RDIMM vs. UDIMM). Boards that support registered RAM have a register that sits between the RAM chip and the board’s memory controller which is used as a buffer. Unbuffered RAM does not have this register. Additionally, RDIMMs typically ship with ECC capabilities allowing real-time correction of memory errors.
Additionally, the system board and CPU impose some limitations. If these limitations are ignored, performance will be suboptimal, or the system may not even boot. DDR3 organizes memory into three channels (DDR2 uses two). To minimize latency, the rule of thumb with DDR2 was to install memory chips in pairs; with DDR3, chips should be installed in triplets. Unlike older types such as RDRAM/Rambus, the system will still function if memory is not installed in pairs (DDR2) or triplets (DDR3), but there is a small performance penalty.
To add to the complication, unbuffered (UDIMM) and registered (RDIMM) RAM do not compare trivially. Unbuffered memory is said to be faster and is most commonly found in desktops, but this does not mean registered RAM is slower: when installed en masse, RDIMMs provide faster and better performance than UDIMMs. The rule of thumb is that each memory channel should hold at least two sticks of registered RAM for optimal performance. Since memory types, speeds, registered vs. unbuffered, and sizes cannot be mixed, valid registered RAM amounts for a triple-channel system are 3rs, where s is the size of one stick of registered RAM (2, 4, 8, or 16GB) and r > 1 is the number of triplets ("rows"). Registered RAM is more expensive than unbuffered RAM.
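To make the 3rs arithmetic concrete, here is a quick sketch (my own illustration) that enumerates the valid registered-RAM totals for a triple-channel DDR3 board under the rule above:

```python
# Sketch: valid total RAM for a triple-channel DDR3 board with RDIMMs,
# following total = 3 * r * s, with identical sticks of size s and
# r > 1 rows ("at least two sticks per channel").

STICK_SIZES_GB = [2, 4, 8, 16]  # size s of one RDIMM
CHANNELS = 3                    # DDR3 triple channel

def valid_totals(max_rows=4):
    """Return (stick_size, rows, total_GB) tuples for valid configurations."""
    totals = []
    for s in STICK_SIZES_GB:
        for r in range(2, max_rows + 1):  # r > 1 per the rule of thumb
            totals.append((s, r, CHANNELS * r * s))
    return totals

for s, r, total in valid_totals():
    print(f"{CHANNELS * r} x {s}GB sticks -> {total}GB")
```

The 24GB configuration mentioned below (6 x 4GB, i.e. s = 4, r = 2) is one of the entries this produces.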
The system board you purchase will list its RAM specifications: what speed chips can be used, what type, any size limitations, and whether it supports RDIMMs, UDIMMs, or UDIMMs with ECC.
My system has 24GB, 6x4GB Kingston DDR3-1333 RDIMM which should be enough for the near future.
So how much did this set you back?
About $2500, which isn’t too bad for me. A MacPro would have been close to $3000 for a low-end configuration. I was eyeing the MacPro…but I am kind of over Mac. I am still not settled on ALL of the hardware in this purchase. I’ve considered exchanging for a faster CPU clock speed, or a different motherboard, but we will see.
I kinda read this as “just buy top of the line everything except maybe the motherboard and you’ll be good.” Any experience with tradeoffs? How would you prioritize CPU vs. RAM vs. disk I/O speed?
In my experience multiple cores don’t necessarily give a huge boost in HPC if you aren’t writing for it. I don’t tend to write multithreaded code because of the complexity and debugging issues that go with it. In that case, more cores won’t help my code finish faster; it’ll only help if there are multiple jobs running at once. Xeons are great if you’re switching between a boatload of threads like a web server might or you’re concerned about power/overheating, but in my experience they aren’t worth the extra cash you pay. In that situation, clock speed can be a lot more important than you give it credit for here.
Well, to be frank, this upgrade has been way overdue, and I did a bit of “throwing money” around. The purpose of the post was to document the decisions I made and what else was out there. Of course it will come across that way, because in an ideal world, if everyone could afford the top of the line hardware, we would all be good computing-wise. It would be too difficult to tailor this post to everyone’s needs.
In terms of priority, *for my research use* I put RAM as the most important, followed closely by the CPU and the number of threads it can run concurrently (not so much the clock speed), followed by disk I/O. I find that for my work, investing in more or better RAM gives the best bang for the buck, followed by the CPU. Although a lot of my work is disk bound (crawling), faster disks are so much more expensive and are out of my budget. Because of this, I had to make the tradeoff of favoring a RAM upgrade over disks. At least there are some tricks that can be done with RAM to prevent overuse of disks.
To me “high performance computing” goes hand in hand with parallelism. Of course if the code has not been written to take advantage of the cores, the extra cores are useless. I would expect that if someone is buying a multicore processor, they intend to program to use the cores. For my research, not programming to take advantage of these extra cores would qualify as *not* high performance computing. I never suggested that CPUs with more cores will make things run faster; just that more work is done at a time. Hadoop is the most trivial example I can give where code takes advantage of multiple cores, OpenMP is another.
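As a minimal illustration of code actually written to use multiple cores (the comment above names OpenMP and Hadoop; this sketch uses Python's multiprocessing instead, with a stand-in workload):

```python
# Sketch: spreading independent jobs across cores with multiprocessing.
# work() is a stand-in for a real CPU-bound per-record computation.

from multiprocessing import Pool

def work(n):
    """Stand-in CPU-bound task: sum of squares below n."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [100_000] * 8
    with Pool() as pool:              # one worker per core by default
        results = pool.map(work, jobs)
    print(f"finished {len(results)} jobs")
```

On a single-threaded program none of this parallel machinery runs, which is exactly the point made above: extra cores only pay off when the code is written to use them.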
Everyone will have their own opinion, but for my use, based on experiences I have had using servers containing Xeons, and the issues I faced without a higher-end processor, the Xeons were the way to go without a question.
why wouldn’t you drop that money on several cheap computers, instead of one expensive one?
for $2500 you could have got 14 motherboards, 14 AMD Phenom quad-core CPUs and 14 x 1 GB ram
Sigh. I’ve heard that. While a cluster would be cool, I tend to use AWS when I need a cluster setup. Also just too much of a pain to have to build that many systems. I don’t spend money often, so I was ok with putting out a one time large purchase ;). This machine really serves two purposes: for development of jobs to be shipped to AWS (time = $), and to prevent me from having to use AWS for high-memory, high-speed applications.
Over the past 20 years, though, I have collected a lot of old machines that might make an ok cluster, just obviously not high end.
Heya
Is SATA 3 working on this motherboard?
I can’t find any info saying that this motherboard supports SATA 3. Only SATA 2 in the official notes.
Let me know if you got SATA 3 to work.
Thanks, bye
I got D18 too, works like dream so far!
It may not support SATA III. Drives seem to lag far behind the interfaces, though. Most drives can barely perform at SATA II interface speeds.
Btw, forgot to say: you might be better off with an EVGA server motherboard. They can OC the CPU, so 12 cores at 4-5 GHz after OC…that would be pretty much power…
I am having trouble receiving the D18. It seems to be a rare beast. It’s taken almost a month now. Considering cancelling. If I do I will probably go with the EVGA or Supermicro.
I can sell you my D18 and jump to EVGA, tbh. I could use some extra OC for my work :)
Which OS do you plan on running?
I finished the system :)
I put in the same HD that was running on my old server. I am running 64bit Ubuntu 10.04 (Lucid). Runs great!
Considering upgrading to 11.04, or switching to CentOS, but I really don’t have too much of a reason to switch to CentOS.