It’s just not worth buying computers at the moment.

In our present market, two turbulent forces entwine to nullify the value of anything less than monumental improvements. These are Brexit and memory price gouging. To qualify the clickbaity title: I specifically mean for technical computing in the UK.

Lies, damn lies, and marketing

In an earlier post I described how, although it is an interesting technical observation, the idea of actual performance doubling has been falsely perpetuated by marketing types. A 10-20% improvement is more realistic. Still, incremental improvements have at least been improvements, and a new machine has been a worthwhile investment over renewing maintenance on an old machine. Plus we like shiny new machines…

But there are many ways of measuring performance, and for many workloads even a 10% generational improvement is a falsehood.

SPECfp

To test my title hypothesis, consider SPECfp. This is a computer benchmark designed to test floating point performance, but it differs from Linpack in that it more accurately represents scientific / technical computing applications. These tend to be very data-orientated and often push entire system bandwidths to the limit moving data on and off the CPUs.

I collected data published on spec.org and, using R, extracted a comparable set of statistics for generations of Intel Xeon E5 CPUs, grouping them by their E5 SKU number to compare generation-on-generation performance. Anyone who has benchmarked AMD for performance applications will naturally know why I’m only looking at Intel…
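The grouping step can be sketched roughly as follows. This is Python rather than the R I actually used, the list of CPU names is a hypothetical sample rather than the spec.org data, and `parse_e5` and `skus_in_all_generations` are illustrative helper names. It relies on the E5 naming convention where the first generation has no suffix and later ones carry v2, v3 and v4.

```python
import re
from collections import defaultdict

# Hypothetical sample of CPU names, not the real spec.org extract.
CPU_NAMES = [
    "Intel Xeon E5-2680",
    "Intel Xeon E5-2680 v2",
    "Intel Xeon E5-2680 v3",
    "Intel Xeon E5-2680 v4",
    "Intel Xeon E5-2699 v3",
    "Intel Xeon E5-2699 v4",
]

# SKU number (with any letter suffix such as W or L), then an optional "vN".
E5_PATTERN = re.compile(r"E5-(\d{4}[A-Z]*)(?:\s*v(\d))?")


def parse_e5(name):
    """Return (sku, generation), or None if the name is not an E5 part."""
    match = E5_PATTERN.search(name)
    if match is None:
        return None
    sku = match.group(1)
    # No "vN" suffix means the first generation.
    generation = int(match.group(2)) if match.group(2) else 1
    return sku, generation


def skus_in_all_generations(names, generations=(1, 2, 3, 4)):
    """Return the SKUs that appear in every one of the given generations."""
    seen = defaultdict(set)
    for name in names:
        parsed = parse_e5(name)
        if parsed is not None:
            sku, generation = parsed
            seen[sku].add(generation)
    wanted = set(generations)
    return sorted(sku for sku, gens in seen.items() if wanted <= gens)


print(skus_in_all_generations(CPU_NAMES))
```

With this sample only the 2680 survives the filter, since the 2699 has no first or second generation part in the list.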

The chart below depicts those E5 numbers which occur in all four of the most recent generations being considered. There was an uneven time gap between generations, so for further comparison I have also plotted this data against time. A trend of decreasing improvements can be seen.

OK yes, I’m exaggerating, but not by much

In fairness to Intel, using these SKUs as the comparison point is a bit of a simplification. They will say we shouldn’t compare based on their E5 label, but whichever way we look at it the same patterns emerge. The chart below takes the highest performing CPU of each generation (including CPUs not represented on the earlier chart) and plots them against time.

I’ve taken the opportunity to join in the debunking of the “doubling in performance every two years” myth here by also plotting where that doubling in performance would have led to. I give this chart the alternative title: “Where marketing think computer performance has been going compared to where it has actually been going”.

Where marketing think computer performance has been going compared to where it has actually been going
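To put rough numbers on the gap between those two lines, here is a minimal sketch of the compounding involved. The 10-20% per-generation range is the figure quoted earlier. The three generational steps and the four-year span are my assumptions for illustration, not values read off the chart.

```python
def doubling_multiplier(years, doubling_period_years=2.0):
    """Performance multiplier if performance doubled every doubling period."""
    return 2.0 ** (years / doubling_period_years)


def generational_multiplier(steps, gain_per_step):
    """Performance multiplier after a number of steps at a fixed fractional gain."""
    return (1.0 + gain_per_step) ** steps


# Assumptions for illustration: three steps between four generations,
# spread over roughly four years.
ELAPSED_YEARS = 4.0
STEPS = 3

print(f"Doubling every two years: {doubling_multiplier(ELAPSED_YEARS):.2f}x")
for gain in (0.10, 0.15, 0.20):
    multiplier = generational_multiplier(STEPS, gain)
    print(f"{gain:.0%} per generation: {multiplier:.2f}x")
```

Even at the optimistic end of the range, three steps of 20% compound to well under a single doubling, while the marketing line would have quadrupled over the same assumed period.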

The chart below plots the performance of all E5-2600 CPUs, including those which do not occur in all generations, for a fuller comparison agnostic to the names these products are given. Again the diminishing returns are apparent.

All E5-2600 performance against time

The Intimidating Shadow of Ivy Bridge

Returning to my hypothesis, I’d specifically like to zero in on the Ivy Bridge (v2) CPUs launched at the tail end of 2013. They were initially priced at a premium, but as prices settled into 2014 many more were bought. Machines sold with three years of maintenance are pretty standard in IT, and so a significant number of machines up for maintenance renewal or replacement are Ivy Bridge.

Comparing the highest SKU of each generation, we see only a 12% increase in real performance from Ivy Bridge to Broadwell. Comparing the same SKU numbers across generations, we typically see around a 22% improvement.

This is most worrying as, with current memory prices and currency exchange rates, servers typically cost 20-25% more than they did 8-9 months ago.
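The arithmetic behind that worry can be sketched as below. It simply divides the performance uplift by the price uplift, treating the price of 8-9 months ago as the baseline, which is a simplification of mine. The function name `value_ratio` is illustrative.

```python
def value_ratio(performance_gain, price_increase):
    """Relative performance per pound after both changes; 1.0 is break-even."""
    return (1.0 + performance_gain) / (1.0 + price_increase)


# Performance gains and price rises quoted in the text, as fractions.
for gain in (0.12, 0.22):
    for rise in (0.20, 0.25):
        ratio = value_ratio(gain, rise)
        print(f"{gain:.0%} faster, {rise:.0%} dearer: {ratio:.2f}x performance per pound")
```

Only the most favourable pairing, a 22% gain against a 20% price rise, comes out marginally above break-even. Every other combination leaves you with less performance per pound than before.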

Memory cost per GB did decrease from Ivy Bridge until recently, plus we now have DDR4, and SSDs are more sensible. But if you have a higher-end Ivy Bridge server falling off warranty, it’s just not worth replacing it right now. Buy maintenance instead and hope Skylake is better.

Time to take Java seriously again?

As for many Computer Science graduates, Java was the first language I’d say I really learnt. Sure, I’d dabbled in C and VB, but Java is where I first wrote meaningful code beyond examples from the textbook. Again like many Computer Science graduates, I turned my back on Java pretty soon after that.

My experience in video game programming, as well as my current day job around research computing (although not in a programming capacity), both feature squeezing every drop out of hardware, which sadly leaves little space for Java. In both, code written in fast low-level languages is optimised to exploit the hardware it will run on.

The ongoing data analytics and machine learning revolution, surely the most exciting area in IT at the moment, is bringing with it a data-centric approach of which we should all take note. The need is not to get the most out of your hardware but to get the most out of your data, as quickly and continuously as possible to retain your advantage.

Spark, for example, is written in Scala, which compiles into Java bytecode to run on the Java Virtual Machine, which itself finally runs on the hardware. Furthermore, many Spark apps are themselves written in a different language such as R or Python, which first has to interface with Spark. This is a lot of layers of abstraction, each adding overheads which would be shunned by performance-orientated programmers.

Yet when I look at these stacks I instead see wonderful things being done and begin to see past my preconceptions.

I’m also seeing containers, which are a natural fit for Java development, grow in prominence. With S2I (source-to-image) builds, developers can seamlessly inject their code from their git repository into a Docker image and deploy that straight onto a managed system.

Whilst C++ will remain the norm for mature performance-orientated applications, hypothesis testing and prototyping to yield quick results is giving an extra life to Java.