Cloud Computing's Next Wave with Liftr Insights' Paul Teich

The cloud computing movement has provided incredible investment returns in recent years. But the industry is changing quickly, and leaders like NVIDIA are at risk of losing share to customized chips that now serve specific applications. Liftr Insights' Paul Teich offers an incredible perspective on what is going on in the cloud and what new tasks like inference processing will mean for its future. This is an absolute must-watch (or, if you prefer the transcript, a "must-read") for anyone actively investing in the cloud!

March 26, 2020 – By Simon Erickson

Cloud computing has been an exciting trend for the investing world.

The collective investment performance of cloud computing companies has simply crushed the broader market in recent years. This has largely been due to a flood of demand, as companies have looked to the cloud to more effectively deploy software or to manage their operations more cost-efficiently.

But this is also an industry that changes quickly and has a ton of technical complexity. A new vernacular of acronyms – such as “SaaS”, “IaaS”, “PaaS”, and “FaaS” – is taking the tech world by storm, as the Cloud Titans are doing their best to provide “as a service” resources to their customers. And a new focus on artificial intelligence software means those customers are becoming even more demanding.

So how can investors make sense of all the changes? To help with that, 7investing has called on a cloud computing expert.

Paul Teich is a principal analyst for Liftr Insights. With nearly 40 years of experience in the IT industry and with 12 patents to his name, he has a detailed understanding of what’s going on in the cloud and the overall direction the industry is heading.

In Part 1 of our recent interview, Paul describes the important trends taking shape in cloud computing today. He explains how cloud service providers are differentiating from one another and why customized hardware is becoming so important for larger companies.

Paul also provides a real-life example: explaining why Amazon developed its “Inferentia” chips in order to power its Alexa smart speakers.

Interview timestamps:

0:00 – Introduction
0:24 – Understanding the “as a service” acronyms and the bigger-picture trends
2:37 – Pricing models and how the increasing demands of AI are leading to a new wave of software
9:08 – How the Cloud Titans are differentiating from one another
13:37 – How “Public Cloud Transparency” defines what new chips are currently being deployed
15:36 – What Inference Processing could mean for Intel and NVIDIA
17:52 – An Example: “Inferentia” for Amazon Alexa

Complete Transcript

[00:00:00] Simon Erickson – Hi everyone! 7investing founder Simon Erickson here. We’re talking about cloud computing this morning.

This is a highly complex and technical industry. So I’m very thankful and honored to be joined by Paul Teich, Principal Analyst at Liftr Insights in Austin, Texas. He’s also an expert in this industry, one of the smartest people I’ve ever talked to about cloud computing. Paul, thanks so much for joining me here this morning.

[00:00:23] Paul Teich – My pleasure. Simon, great to be here!

[00:00:25] Simon Erickson – Paul, like I said, cloud computing is highly complex and there are a lot of buzzwords going around right now. But at the 10,000-foot level, what are a couple of fundamental changes that you see taking place in the cloud?

[00:00:36] Paul Teich – So actually backing up to maybe a view from geosynchronous orbit (a really technical reference!), we hear a lot about serverless. And we hear a lot about software as a service.

And essentially, there are four categories of cloud computing today. “Software as a service” which essentially is a data migration. You’re giving up your app to use somebody else’s app. That app is based in the cloud. And you’re paying for the app usage and that’s where you get your efficiency.

Underneath that is “platform as a service”. And so platform as a service enables your developers to kind of string together Amazon’s database or Google’s A.I. or Microsoft’s apps, right? Active Directory is a good example. So it lets you use bits of other people’s services to kind of paste together an app without actually having to do all that programming development yourself.

Underneath platform as a service is “infrastructure as a service” (“IaaS”). And you’ve got this emerging “functions as a service” (“FaaS”). FaaS is what we refer to as serverless computing. OK. Now, that’s kind of a misnomer. There are servers in serverless computing.

But the difference is more that IaaS is traditional programming. So you use your traditional developer environments. You load your operating system, or use an operating system available to you, in an infrastructure as a service instance. But everything else is up to you. So you’re writing the app, you’re using Python or whatever language you’re choosing and treating it just like your own server.

But it’s running somewhere else and that’s where you get your efficiencies. As somebody else is maintaining the server, they’re powering it more efficiently, cooling it more efficiently. And that’s what cloud is. You pay as you go.

[00:02:37] Pricing is, folks call it various things, “on demand” or “pay as you go” as the baseline pricing. And then you can do spot pricing or reserved instances to try and get better than the standard pricing.
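As a rough illustration of the three pricing models Paul mentions, here is a minimal Python sketch. Every rate and discount in it is a made-up placeholder rather than a real price from any provider, the function names are hypothetical, and the billing rules are deliberately simplified (for instance, spot interruptions are ignored).

```python
# Illustrative only: every rate and discount below is a made-up placeholder,
# not a real price from any cloud provider.

HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_costs(on_demand_rate, reserved_discount, spot_discount, hours_used):
    """Return the monthly bill under three simplified pricing models.

    Assumptions (hypothetical):
      - on-demand bills only the hours actually used, at the list rate
      - reserved bills every hour of the month at a discounted rate,
        whether or not the instance is running
      - spot bills only the hours used at a deeper discount, and ignores
        the risk of the instance being interrupted
    """
    return {
        "on_demand": on_demand_rate * hours_used,
        "reserved": on_demand_rate * (1 - reserved_discount) * HOURS_PER_MONTH,
        "spot": on_demand_rate * (1 - spot_discount) * hours_used,
    }


def cheapest_model(costs):
    """Name of the pricing model with the lowest monthly bill."""
    return min(costs, key=costs.get)


# A lightly used instance (200 hours) versus one that runs all month.
light = monthly_costs(0.40, reserved_discount=0.40, spot_discount=0.70, hours_used=200)
steady = monthly_costs(0.40, reserved_discount=0.40, spot_discount=0.70, hours_used=HOURS_PER_MONTH)
print(light)
print(steady)
```

Under these placeholder numbers, the reserved commitment only beats on-demand once the instance runs more than 438 of the 730 hours in the month (60 percent), which is why the baseline "pay as you go" rate can still be the better deal for lightly used workloads.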

The programming difference in functions as a service is that you actually have to use the logic provided by the functions as a service. And it’s not traditional programming. There are some limitations in terms of the complexity of the logic you can use. It takes retraining your programming staff on how to use functions as a service. And so we hear that a lot of new development is considering functions as a service.

Back up another step: if you’re doing a “lift and shift” and you have a program that runs in a virtualized environment like VMware (which pretty much everybody does), your options are to do just a complete lift and shift: you run VMware on your on-prem infrastructure. You would go rent an IaaS instance (infrastructure as a service) and just use VMware in the infrastructure as a service instance to migrate your app from your on-prem infrastructure or your hosted managed infrastructure into a cloud.

To go to functions as a service, you have to refactor your application. And to go to platform as a service, you have to refactor your application. So it takes a bit of knowledge as you’re migrating your apps after that first step.

Now the gotcha with VMware is that a lot of the benefit you get from renting cloud instances is by right-sizing the instance. And so you can get two cores, four cores, eight cores…a certain memory per core profile, SSD or not SSD, fast network connection back to your S3 storage buckets. You know, there’s lots of options to tune what you’re paying for in the cloud.

However, if you do a pure lift and shift from VMware, what you’re going to end up with is a big bare metal instance. And so, it’s kinda like trading hyperconverged infrastructure on-prem for hyperconverged infrastructure in the cloud.

And so what we hear is kind of, “Yeah, it’s great. I moved to the cloud. I’m on VMware and I’m not seeing the cost efficiencies that I would be getting if I were to do some kind of refactoring.” OK, if I were to not even go cloud native. But just let’s say I re-compiled, if I’m on a compiled system or if I’m using some PaaS services on an IaaS platform, I can be really smart about which services I’m using. But if all I’m gonna do is a lift and shift, it’s not going to be as cost effective as I thought.
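The right-sizing idea Paul describes can be sketched in a few lines of Python: pick the cheapest instance that still meets a workload's core and memory needs, and compare that with landing on a big bare metal box. The catalog below is entirely hypothetical (names, sizes, and prices are invented for illustration), and real instance selection would weigh many more options, such as SSD and network speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    """One rentable instance size. All values used below are hypothetical."""
    name: str
    cores: int
    memory_gib: int
    hourly_usd: float

    @property
    def memory_per_core(self):
        """Memory per core in GiB, the profile Paul keeps coming back to."""
        return self.memory_gib / self.cores


def right_size(catalog, cores_needed, memory_gib_needed):
    """Return the cheapest instance that meets both requirements, or None."""
    candidates = [
        inst for inst in catalog
        if inst.cores >= cores_needed and inst.memory_gib >= memory_gib_needed
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda inst: inst.hourly_usd)


# Invented catalog: names, sizes, and prices are placeholders only.
CATALOG = [
    InstanceType("small-2c", cores=2, memory_gib=8, hourly_usd=0.10),
    InstanceType("medium-4c", cores=4, memory_gib=16, hourly_usd=0.20),
    InstanceType("large-8c", cores=8, memory_gib=64, hourly_usd=0.50),
    InstanceType("bare-metal-96c", cores=96, memory_gib=768, hourly_usd=5.00),
]

# A workload that needs 4 cores and 16 GiB fits the medium instance.
choice = right_size(CATALOG, cores_needed=4, memory_gib_needed=16)
bare_metal = CATALOG[-1]
print(choice.name, choice.hourly_usd, "vs", bare_metal.name, bare_metal.hourly_usd)
```

In this toy catalog, a pure lift and shift onto the bare metal instance pays for 96 cores when the workload only needs four, which is the gap in cost efficiency that the refactoring discussion is about.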

So that’s kind of the basics of what’s where. With PaaS, SaaS, FaaS, you really don’t get much of a choice in terms of what you’re running on. So the actual cloud provider, they provide the hardware that your database runs on, if you’re using RDS on Amazon. OK. You have no idea what actual servers are running that storage system. And you don’t care. It’s platform as a service. They’re just providing it at a price with a service level agreement.

With IaaS, you have more control. And what we’re seeing is while some people jumped into cloud, companies decided to go all in on cloud, they’re doing some repatriation now, as they discover they’re not getting the cost efficiencies that they were supposed to get in some areas. We’re finding kind of that same thing with functions as a service. As developers rushed over to do functions as a service, they discovered there were some things they couldn’t do. There were some limitations.

We’re cloud based. We’re a DevOps group. We run in AWS. And we were using…we tried to go all in on functions as a service on Lambda. And it didn’t work that well. OK, for us there were some limits. And so we have part of it running functions as a service and part of it running as traditional instance types in IaaS.

I’ll leave it there. We can go any direction.

[00:07:05] Simon Erickson – So it sounds like at the basic level though, everything’s “as a service” now. The goal is to provide those resources and that infrastructure, so people can focus more on developing their apps. And then whether it’s Amazon, Google, Microsoft, or anyone else: they’re going to take care of the behind the scenes work, so they can focus on the things they really want to focus on.

[00:07:26] Paul Teich – With the exception of IaaS.

So that whole focus on behind the scenes, you go one step lower. As you move down the abstraction layer.

So if I just want somebody to replace my payroll system. Done. I don’t need to write another payroll system, ever. There are so many sources. That’s software as a service. And you can find analytics packages that are software as a service.

But if you have a special need, if you’re doing something that a lot of companies aren’t doing, and that’s what we find with artificial intelligence and analytics, some of these streaming applications don’t…the state of the industry there hasn’t evolved enough to have a package. An off the shelf package that’s just what a company needs.

That’s when you back up and you start looking at functions, at platforms, and then it’s infrastructure as a service. And if you really need the control, infrastructure as a service is a great way to go.

What we’re seeing is, despite all the hype about moving away from infrastructure as a service to platform as a service, to functions as a service, it’s grown. The top four clouds, what we measure, have grown in complexity by over 40 percent since we started measuring about a year ago.

And by that, I mean the regional growth (the new regions that Google, Microsoft, Amazon and Alibaba have lit up for their clouds, the new geographies they’ve enabled) and new instance types and new sizes of types. So in that proliferation and complexity, we see a 40 percent growth in less than a year.

We’re frankly pretty stunned.

[00:09:08] Simon Erickson – And Paul, second question for you. With those large cloud providers you just mentioned that are growing very quickly: are the offerings that they are giving out there, are those becoming somewhat commoditized? Or are there certain things that they’re doing to differentiate? What does Amazon have that Google doesn’t have? That Microsoft doesn’t have? That Alibaba doesn’t have?

How are they standing apart from one another, as a cloud service provider?

[00:09:32] Paul Teich – So first level of complexity is memory per core. So large memory spaces. We’re seeing a trend toward more cores. There’s a couple of different axes here. So one is processor choice. You’ve got Intel versus AMD. Now increasingly, ARM is starting to be a choice. So Amazon has shipped its Graviton2 second-generation ARM-based processor.

But really, the choice is memory per core. The processors have fairly similar performance, we’re gonna be digging into that in the future. But if you assume there’s some fixed deltas between ARM cores, Intel cores, AMD cores. The next thing is “do I need 1, 2, 4, 8, 16 gigabytes per core?”

And so the core count interplays with “I need terabytes of memory”. If, for instance, I need higher than eight gig per core, I need to go to Microsoft pretty much. Microsoft is differentiated in supplying very large memory per core configuration points. So memory per core.

We’re starting to see some speed differentiation with Cascade Lake. So we saw everything kind of stalled under three gigahertz per core, in terms of clock frequency, and so that’s how they’re differentiating. The other one’s SmartNICs.

So as the clouds offload the hypervisor on to the SmartNIC…kind of a sidecar processor. So there’s this whole evolution of network processing, offloading the network stack, trying to get as much application performance out of your server as possible.

And the extreme of that is that now, like AWS Nitro and Microsoft’s Catapult cards (and I’m sure Google has one; they haven’t talked about it yet. And actually Alibaba has what they call X-Dragon), they all have these sidecar processors on the NIC, which now contain processors of their own. So they can go run the hypervisor stack, the root hypervisor, off of the main server processor and give the application all of the performance it needs for its host operating system, if that’s a choice. Or the container. And your application.

So you have a bunch of different variables. The big ones in the next six months, I think, are probably going to be memory space/size (memory per core) and processor speed, with Cascade Lake coming on. We’ll see about AMD, which was at parity with Intel (although core performance appears to be very good and memory per core seems to be really good). AMD was kind of stuck at that same kind of frequency as Skylake. And so we’ll see if they can boost frequencies with Intel Xeon Cascade Lake. Does that answer it?

[00:12:47] Simon Erickson – Absolutely it does.

[00:12:51] Paul Teich – And pricing is also a bit differentiated. So they each have slightly different pricing strategies. They’re not going head to head, one to one.

We’ve actually done some interesting box-and-whisker chart analysis, when you put hundreds of different instance types with eight cores together in different memory profiles. And you start to kind of tease apart where their pricing strategy is.

Google has a very simple pricing strategy. They’re aimed at developers, mostly, historically. They’re trying to go to enterprise. But I think the other three offer a wider variety of pricing. I guess to appeal better to enterprises who want to fine tune their cost optimization.

[00:13:37] Simon Erickson – Let’s dig deeper into that a little bit. Because I think one of the interesting things that you’re doing at Liftr is what you’re describing as “public cloud transparency”. Right? You’re able to see the hardware configurations that are running all those instances all over the world.

And so, to your point, Google is slightly different than Amazon, which is slightly different than Microsoft and anything else out there.

What can you tell us that you’re finding out there? I mean, what are the overall takeaways that you’re seeing from what’s getting deployed? This used to be very proprietary, right? What kind of hardware people were using. But what are you starting to notice out there?

[00:14:15] Paul Teich – I think the biggest competitive shift is that the clouds are starting to derive value out of advertising that they’re using alternative CPU chips, processor chips.

So it used to be AMD was (and AWS was one of the first), they said we’ll offer it at a 10% discount, see if anybody bites. It was a value play. But as Google gets into it, as Microsoft gets into it, it starts to become a more nuanced, competitive arena. Where I think customers will start paying less attention to which processor choice they’re buying.

On the GPU front. It’s funny, because essentially Intel’s got this one big competitor [AMD]. NVIDIA doesn’t have that particular market challenge. But they have a lot of small competitors. And they add up to kind of the distinct share that AMD is taking in the processor space. What we’re seeing is the focus has shifted from there’s still a bunch of virtual desktops, virtual workspaces, workstations, stuff like that happening in one portion of the enterprise cloud market. And we just saw AMD move into that space with its new Radeon Instinct.

[00:15:36] But the real action, I think, is going to happen in inference processing.

So just to back up a second. In the A.I. world, in the deep learning/machine learning world, we divide up the training task from the inferencing task.

So when you’re training a model, you’re trying to discover what works. How to recognize a cat or a hot dog. [Laughs]. These trivial examples. And you try to look for these things, you know, combing through video for facial surveillance. And things like that. They’re very sophisticated tasks.

Training a system to recognize something, whether it’s text or video images or multi-dimensional patterns and other data. Right. It’s essentially a supercomputing task. And NVIDIA still kind of owns that market. Google invented their tensor processing unit (TPU) as a way to try to offer an alternative. And I think that didn’t work as well as they would like. It’s hard to expose that programming to developers.

So they only expose TPUs through TensorFlow. Recently, they have opened that up a little bit. I think PyTorch was in development. And may have actually shipped…I need to go brush up on that.

So that means only two programming languages can use a Google Cloud TPU. Whereas with the multitude of anything you want to kind of throw at an NVIDIA GPU, they’ve developed a driver for it. That’s kind of the challenge AMD has in the market. It’s also the challenge that Xilinx and Intel have in the FPGA world: it’s much more difficult to program these because you have to do a lot more work to get your code to run on it with the rest of the code you’re writing in the cloud. Whereas NVIDIA’s done a good job with making sure that the integration is fairly seamless.

So we’re seeing the movement away from general purpose GPUs – seems really strange to say that (so graphics processing units; what NVIDIA sells) – toward having deep learning specific cores. They have a tensor core in their latest generations.

[00:17:52] So the V100 (the Volta generation from NVIDIA) and also the T4 (the new Tesla thing) have these tensor cores in them that are specifically designed to accelerate deep learning tasks. Now the V100 is aimed at training tasks. The T4 is aimed at inferencing tasks. Which is where AWS has introduced their Inferentia. So it’s an in-house design.

Backing up a bit, I mentioned Graviton. OK, so AWS has this thing you’ve probably heard of called Alexa, OK? And what they’ve done and they’re public about this is Alexa’s deployed at such scale. We have millions, tens of millions of people using the smart speaker services, right? There’s many levels of deep learning associated with that.

So first level is speech recognition. What am I saying? What are the words I’m saying? Second level is natural language processing, which says, all right, so what’s the intent behind what I’m saying? If I ask for a sports score, what team am I asking for, is it something I ask for habitually, things like that.

Then there’s the whole context behind: Now that you know what I’ve said, you understand the intent behind what I’ve said. Now you have to go find the information. It’s multiple levels of deep learning models to go handle Alexa.

So what they did was they invented…they designed their own processor, Graviton (now the Graviton2), and they designed their own inferencing accelerator called Inferentia to go power, specifically, Alexa tasks.

And so they decided (well, this is something we see from all the clouds, by the way), when they deploy something at scale internally to serve an internal software as a service function (and Alexa falls directly into that category) they start buying in such large quantities that it becomes not just an option but an imperative to expose that capability to the public cloud. Right.

So even though they deployed first with Alexa, they worked all the bugs out of it, and then they offer Inferentia now as an alternative to programmers who wanted to deploy inferencing at scale for their apps.

The challenge there is they had to go the same route as Google went. And you have a limited selection of programming languages and model types that you can run on Inferentia today. Though, like Google, that will probably get wider as we go through time.

But because it’s not a general market thing. Because it was specific to a certain set of deep learning tasks, AWS’ challenge is to give it broader appeal.

But the decision to “Hey, let’s push this into the public cloud in some regions” is really a no-brainer. They’re buying it at such scale. And if they’re seeing the cost efficiencies – which for Alexa translates into precise inferences per second per watt. So it’s a complex interplay of “Am I doing the thing accurately and fast”, you know, hitting a certain user experience. I understand which team I’m talking about (because it’s baseball season and it’s the team I always ask about when it’s baseball season, right? Which is different than the team I ask about during basketball season). But it knows these things, right. It’s kind of context sensitive.

And it’s doing that at an efficiency that my data center can serve a lot of these requests per megawatt.
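The efficiency metric Paul describes, inferences per second per watt, is simple arithmetic, and a short Python sketch shows how it scales up to requests per megawatt. The throughput and power figures are invented placeholders, not measurements of Inferentia or any other chip, and the function names are hypothetical. The sketch also assumes each user request needs a fixed number of model inferences (the transcript mentions several stages, such as speech recognition and natural language processing) and ignores cooling and other data center overhead.

```python
def inferences_per_second_per_watt(inferences_per_second, watts):
    """Efficiency of one accelerator: throughput divided by power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return inferences_per_second / watts


def requests_per_second_per_megawatt(inferences_per_second, watts,
                                     inferences_per_request=1):
    """User requests per second that one megawatt of accelerators can serve.

    Assumes every request needs `inferences_per_request` model inferences
    and that power scales linearly with the number of accelerators.
    """
    if inferences_per_request <= 0:
        raise ValueError("inferences_per_request must be positive")
    per_watt = inferences_per_second_per_watt(inferences_per_second, watts)
    return per_watt * 1_000_000 / inferences_per_request


# Placeholder accelerator: 3,000 inferences per second at 100 watts,
# with three model stages behind each spoken request.
efficiency = inferences_per_second_per_watt(3000, 100)
capacity = requests_per_second_per_megawatt(3000, 100, inferences_per_request=3)
print(efficiency, capacity)
```

Doubling the inferences per second per watt doubles the requests a fixed power budget can serve, which is why the metric matters so much at the scale Paul describes.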

So that’s kind of the scale at which we’re operating now. And it’s kind of what a cloud applications developer and then the folks who manage that application are trying to take advantage of.

[00:22:08] Simon Erickson – Fair enough. So if I can summarize that – overgeneralize this, perhaps – it’s purpose driven for the companies that are developing it. But then it’s still an operations per second cost efficiency for everybody else that’s using those on the public cloud.

[00:22:23] Paul Teich – Absolutely.
