It’s just not worth buying computers at the moment.

In our present market, two turbulent forces entwine to wipe out the value of anything less than a monumental improvement: Brexit and memory price gouging. To qualify the clickbaity title: specifically, I mean for technical computing in the UK.

Lies, damn lies, and marketing

In an earlier post I described how the idea of actual performance doubling, although rooted in an interesting technical observation, has been falsely perpetuated by marketing types. A 10-20% improvement is more realistic; however, incremental improvements have at least been improvements, and a new machine has been a worthwhile investment over renewing maintenance on an old one. Plus we like shiny new machines…

But there are many ways of measuring performance, and for many workloads even a 10% generational improvement is a falsehood.

SPECfp

To test my title hypothesis, consider SPECfp. This is a computer benchmark designed to test floating point performance; however, it differs from Linpack in that it more accurately represents scientific / technical computing applications. These tend to be very data-oriented and often push the entire system's bandwidth to its limits moving data on and off the CPUs.

I collected data published on spec.org and, using R, extracted a comparable set of statistics for generations of Intel Xeon E5 CPUs, grouping them by their E5 SKU number to compare generation-on-generation performance. Anyone who has benchmarked AMD for performance applications will naturally know why I'm only looking at Intel…
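Roughly, the grouping looked something like the sketch below. This is only an illustration: the CSV file and its cpu / result column names are hypothetical stand-ins for however the spec.org data actually arrives.

```r
# Minimal sketch: group published SPECfp results by E5 SKU and generation.
# The column names ("cpu", "result") are assumed, not the real spec.org layout.
library(dplyr)
library(stringr)

spec <- read.csv("specfp_results.csv", stringsAsFactors = FALSE)

e5 <- spec %>%
  filter(str_detect(cpu, "E5-26")) %>%
  mutate(
    sku        = str_extract(cpu, "E5-26\\d{2}"),             # e.g. "E5-2690"
    generation = coalesce(str_extract(cpu, "v[2-4]"), "v1")   # Sandy Bridge has no suffix
  ) %>%
  group_by(sku, generation) %>%
  summarise(best_result = max(result))

# SKU numbers that appear in all four generations, for like-for-like comparison
common_skus <- e5 %>%
  group_by(sku) %>%
  filter(n_distinct(generation) == 4)
```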

The charts below depict those E5 numbers which occur in all four of the most recent generations being considered. There was an uneven time gap between generations, so for further comparison I have also plotted this data against time. A trend of decreasing improvements can be seen.

OK, yes, I'm exaggerating, but not by much

In fairness to Intel, using these SKUs as the comparison point is a bit of a simplification. They will say we shouldn't compare based on their E5 label, but whichever way we look at it the same patterns emerge. The chart below takes the highest performing CPU of each generation (including CPUs not represented on the earlier chart) and plots them against time.

I've taken the opportunity to join in the debunking of the "doubling in performance every two years" myth here by also plotting where that doubling in performance would have led. I give this chart the alternative title: "Where marketing think computer performance has been going compared to where it has actually been going".

[Chart: where marketing think computer performance has been going compared to where it has actually been going]
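For the curious, the doubling line is just compound growth: a score doubling every two years multiplies by 2^(t/2) after t years. A rough sketch of how that comparison line could be drawn, assuming a small best_per_gen frame with hypothetical launch_date and best_result columns:

```r
# Plot best-per-generation SPECfp scores against a "doubling every two years"
# projection from the earliest generation's score.
library(ggplot2)

start_date  <- min(best_per_gen$launch_date)
start_score <- best_per_gen$best_result[which.min(best_per_gen$launch_date)]

curve <- data.frame(date = seq(start_date, max(best_per_gen$launch_date), by = "month"))
curve$doubling <- start_score *
  2 ^ (as.numeric(curve$date - start_date) / 365.25 / 2)   # days -> years -> doublings

ggplot(best_per_gen, aes(launch_date, best_result)) +
  geom_line() + geom_point() +
  geom_line(data = curve, aes(date, doubling), linetype = "dashed") +
  labs(x = "Launch date", y = "SPECfp rate (best per generation)")
```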

The chart below plots the performance of all E5-2600 CPUs, including those which do not occur in all generations, for a fuller comparison agnostic to the names these products are given. Again the diminishing returns are apparent.

[Chart: all E5 performances against time]

The Intimidating Shadow of Ivy Bridge

Returning to my hypothesis, I'd specifically like to zero in on the Ivy Bridge (v2) CPUs launched at the tail end of 2013. They were initially priced at a premium, but as prices settled into 2014 many more were bought. Machines sold with three years of maintenance are pretty standard in IT, and so a significant number of machines now up for maintenance renewal or replacement are Ivy Bridge.

Comparing the highest SKU of each generation, we see only a 12% increase in real performance from Ivy Bridge to Broadwell. Comparing matching SKUs across the same span, we typically see around a 22% improvement.

This is most worrying because, with current memory prices and currency exchange rates, servers typically cost 20-25% more than they did 8-9 months ago.

Memory cost per GB did decrease from Ivy Bridge's launch until recently, plus we now have DDR4 and SSDs are more sensible. But if you have a higher-end Ivy Bridge server falling off warranty, it's just not worth replacing it right now. Buy maintenance instead and hope Skylake is better.


The Compute Landscape at the Beginning of 2017

For years the IT industry has accepted Intel as the only viable option. At the beginning of Intel's reign the consensus was: "yeah, Intel CPUs are way better than anyone else's, let's buy lots". But now the feeling is: "oh, another incremental upgrade from Intel. What's AMD up to? Ah, still nothing. Fine, buy more Intel…"

To be fair, it is mean of me to wail on Intel for AMD's failure (or refusal) to compete; Intel have still been innovating, just not at the rate we became accustomed to in the competitive years. 2017 looks to be an interesting year, a year we all get more choice.

Beyond Kidz wiv Graphics Cardz

NVIDIA have been pushing really hard for years now to establish themselves beyond gaming. Their GPU hardware offers excellent performance, but despite NVIDIA creating a whole CUDA ecosystem to support their products, few users made the leap. Incrementally faster horses were fine, and we could all get on with our work.

Deep/machine learning is beginning to revolutionise IT. It's stretching out beyond academia into more and more commercial uses; soon, if you do not have an analytics strategy, you will not be competitive. This is an excellent area to use GPU accelerators: many machine learning applications involve a number of parallel computations that grows with the amount of data, and "big data" applications exploit scale-out designs beautifully.

Intel position their Phi co-processors (and lately Knights Landing processors) as a competitor to NVIDIA GPUs, but without significant direction no one really knows what to do with a large number of inferior Xeon cores in one box. Our E5 Xeons are often not at 100% utilisation, so there's little benefit in moving to a platform with less memory per core and less network bandwidth per core.

After years of unchallenged Intel dominance, they are emitting the Field of Dreams aura of "if we build it, they will come". This works for Xeon E5 chips, as no one is building anything else. But with NVIDIA building and aggressively supporting users' moves to their platform, accelerator users are flocking to NVIDIA, leaving Phi and Knights Landing dead on the side of the road.

Are AMD about to ante up?

You'd think that, as Intel have been cramming more and more cores into a box, AMD should have been quite competitive; until recently AMD were exceeding Intel in this metric. But their architecture is such that two "cores" share a floating-point unit. This makes it not too dissimilar to Intel's Hyper-Threading, where two virtual cores also time-share a physical core. Both get good utilisation out of their shared execution resources, but in most fair comparisons Intel outperforms AMD.

AMD have been viewed as a cheaper "also-ran". With the major exception of cloud providers, most of the industry has been moving to do more with less hardware. And even many cloud providers are using Intel (often E3s stacked high and sold cheap).

Intel have been coasting. The time is right for AMD to get back in the game. PCIe Gen 4.0, along with a refreshed microarchitecture, could offer great potential for high-bandwidth applications: bandwidth between CPUs and accelerators, memory, and the network.

Choice is Good

I'm speculating somewhat on AMD's next platform and whether it will be any good, but NVIDIA certainly are well placed for 2017. The announcement of their Pascal architecture last year was a game changer for accelerators, one we are still feeling the excitement from. And IBM's opening of their historically proprietary POWER platform into the OpenPOWER Foundation opens the gates for more competitive POWER systems to break through.

I see more going on in compute now than there has been for years.

Hacking Tennis for luls and profit

As with many tech nerds, although employed in a specific area of IT, I like to dabble in others in my free time. My most recent dabbling has been in data science. Although I say "science", I'm afraid my intentions are less noble than the word implies. I'm more interested in exploiting data for profit.

Odds of that?

Were I a bookmaker setting odds, I could simply guesstimate the probability of an outcome, knock a bit off for my "fee", and offer those odds to my punters. But where's the profit if no one backs the loser?

The bookies have an awful lot of information at their disposal that they can use to balance a book. For example, they know which teams / sports stars are popular with punters and will have a reasonable idea of how many bets they can expect when they offer any given odds. Were I setting odds, I would be more interested in predicting how many people will take my odds, and for what stakes, than in the messier business of predicting the outcome of a sporting event.

My goal as a bookmaker would be to make as much money as possible, as reliably as possible. I would not be at all interested in "gambling". I suspect larger bookmakers already do this, which would leave an interesting inefficiency in the market ripe for exploiting: the odds represent punters' expectations of the outcome, not the probability of the outcome.

Why Tennis?

I like tennis. Well I don’t watch tennis, but if I were to I think I’d like it. Tennis is an ideal candidate sport for odds profiteering for a number of reasons:

  1. Singles tennis is a simple competition between two players without group dynamics and summing of component parts to account for
  2. It's enjoyed by many for the sport itself, meaning a wide range of data is publicly available for fans' enjoyment, unlike horse racing where useful data is behind a paywall
  3. Underdogs win fairly regularly. In 2016, nearly 28% of matches were won by the underdog[1]

I see predicting which underdogs win as a good area to make money. I theorise that there are unsupported, relatively unknown players whom few punters want to back. Bookies will incentivise with higher-paying odds on these players to balance their book and remove the gambling element.

I have been exploring this area with machine learning algorithms with promising results.

First Pass

As a proof of concept I used datasets from tennis-data.co.uk and simulated predicting the 2016 season. I used an out-of-band validation technique where, for a given day, only data from previous days were considered to train the model, and the model was then used to predict that day. In my implementation training the model was the bottleneck, so to shorten the runtime I tested three days at once, meaning the second and third tested days would be using an "outdated" model. I was careful to avoid leakage and deemed this an acceptable compromise, as it could only make results worse.[2]
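A minimal sketch of that loop, assuming a matches data frame with a Date column, pre-computed feature columns, and an underdog_won outcome (all hypothetical names), and using a logistic regression as a stand-in for whichever model was actually trained:

```r
# Walk-forward validation in three-day batches: train only on earlier days,
# then predict the next three days with that (slightly stale) model.
library(dplyr)

test_days   <- sort(unique(matches$Date))
predictions <- list()

for (i in seq(1, length(test_days), by = 3)) {
  batch <- test_days[i:min(i + 2, length(test_days))]
  train <- filter(matches, Date <  batch[1])   # strictly earlier matches only
  test  <- filter(matches, Date %in% batch)    # the three days being predicted

  if (nrow(train) < 100) next                  # wait for some history to accrue

  model <- glm(underdog_won ~ . - Date, data = train, family = binomial)
  test$p_underdog <- predict(model, newdata = test, type = "response")
  predictions[[length(predictions) + 1]] <- test
}

results <- bind_rows(predictions)
```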

I implemented some very simple features based on the easily available data (mostly game win percentage per set, and comparisons with the opponent) and used these to train a predictive model in R to calculate a rough probability of the underdog winning, using only data that would have been available before each match.
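As an illustration of the "game win percentage" style of feature, here is one leak-free way to build a historical rate per player, assuming the raw matches have already been reshaped to one row per player per match with hypothetical games_won / games_played columns:

```r
# Cumulative game-win percentage per player, using only matches that happened
# before the current one (lagged, so the match being predicted never feeds
# its own feature).
library(dplyr)

player_history <- long_matches %>%
  arrange(Date) %>%
  group_by(player) %>%
  mutate(
    prior_games_won    = lag(cumsum(games_won),    default = 0),
    prior_games_played = lag(cumsum(games_played), default = 0),
    game_win_pct = ifelse(prior_games_played > 0,
                          prior_games_won / prior_games_played, NA)
  ) %>%
  ungroup()

# For the "comparison with the opponent" feature, the model then sees the
# difference between the underdog's rate and the favourite's rate per match.
```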

This probability is combined with the betting odds to calculate a theoretical “average” return[3] for backing the underdog based on my assigned probability.
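Concretely, for a one-unit stake at decimal odds, that calculation is just the model's probability multiplied by the odds; the numbers below are made up purely for illustration:

```r
# Theoretical "average" return on a 1-unit stake backing the underdog.
p_underdog   <- 0.45    # model's estimated probability of the underdog winning
decimal_odds <- 3.80    # bookmaker's decimal odds on the underdog

expected_return <- p_underdog * decimal_odds   # 1.71: greater than 1.0, so
                                               # "worth" backing on average
```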

The Results

My results were very promising indeed. If you back every underdog you lose: some come in, but not enough to recoup the other lost stakes. But if you were to back every underdog my model estimated to have a theoretical return greater than 1.0, then you would make a profit.

The plot below illustrates the profit made and the number of bets made based on setting the threshold in different places.

[Chart: profit and number of bets at different threshold settings]
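The sweep behind that plot is straightforward, assuming the results frame from the validation loop also carries the underdog's decimal odds and the actual outcome (hypothetical odds_underdog and underdog_won columns):

```r
# Profit and bet count at each threshold on the theoretical return,
# with flat 1-unit stakes on every qualifying underdog.
library(dplyr)
library(purrr)

results <- mutate(results, expected_return = p_underdog * odds_underdog)

thresholds <- seq(1.0, 2.0, by = 0.05)

sweep <- map_dfr(thresholds, function(t) {
  bets <- filter(results, expected_return >= t)
  data.frame(
    threshold = t,
    n_bets    = nrow(bets),
    profit    = sum(ifelse(bets$underdog_won == 1,
                           bets$odds_underdog - 1,   # winner pays odds minus stake
                           -1))                      # loser costs the stake
  )
})
```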

The trick to maximising profit is deciding where to set the threshold above which you back an underdog. This is a conundrum, as it is very dangerous to set the threshold for a predictive model using data after the fact.

My biggest criticism of the results is how few bets worth making were found. Setting the threshold at 1.5 results in only 200 matches identified as worth betting on across the whole year, and only 36 of these come in. The odds were high enough to recoup the losses (with flat unit stakes, those 36 winners need average decimal odds above roughly 200/36 ≈ 5.6 just to break even), but such small quantities seem too much like "gambling" and vulnerable to fluctuation. With the limitation of only one reality to test outcomes in, it is unfortunately impossible to know whether this is the good or the bad end of the range of possible outcomes.

What next?

I am pleased with the direction of my results but do not believe them conclusive enough to put this into production. I only used a small number of "features" to train my model and believe there is more valuable mining to be done here.

The major bottleneck in my experiments was the time it took my computer to train the model in R. The winter holidays have been a good time for me to do this: not only have I had time off work to write my code, but also time with family away from my computer, allowing it to work whilst I don't.

To make real progress I need more throughput. I do have experience in C++, but limited access to good machine learning algorithm implementations for it. Learning Spark seems like a good way forward: benchmarks I've seen place it well ahead of R, and its scale-out parallel design would allow me to add more cheap hardware if I keep seeing good results.

Plus I may be looking for a new job in Data Science / Big Data in the near future and Spark is the feather to have in your cap right now.

 

Footnotes:

[1] by Bet365’s odds, 734 of 2626 recorded matches (three were excluded for not having odds available).

[2] I’d argue “could” should be read as “should” if this were written by someone else.

[3] Warning, don’t discuss philosophy with a computer guy: A theoretical average where the same match is played a number of times simultaneously in which different results are possible. Assumes “fate” isn’t a thing but also that instances are finite.