Bring complex ML models to production with serverless • The Register


Special series An old truism of machine learning is that the bigger and more complex the model, the more accurate its predictions – up to a point.

In ML disciplines like natural language processing, it is the massive BERT and GPT models that make practitioners swoon when it comes to accuracy.

When it comes to putting these models into production, however, the buzz fades: their sheer size makes using them quite a struggle. Not to mention the cost of building and maintaining the infrastructure needed to take them from research to production.

Avid followers of IT trends may remember the advent of serverless computing a few years ago.

The approach promised fairly large computing capacities that could be automatically scaled up and down to meet changing requirements and keep costs down. It also promised to relieve teams of the burden of maintaining their own infrastructure, since serverless is mostly delivered as a managed offering.

Well, serverless hasn't gone away since then, and at first glance it seems like an almost ideal solution. Dig deeper, however, and constraints on things like memory usage and deployment package size get in the way of making it a straightforward option. Still, interest in combining serverless and machine learning is growing – and with it the number of people working on ways to slim down BERT models and fit within provider limits to make serverless deployments feasible.
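The exact limits vary by provider, but a back-of-the-envelope check shows why full-size transformer models are a tight fit. As an illustration only, the sketch below uses AWS Lambda's published caps (250 MB for an unzipped deployment package, 10,240 MB maximum function memory); the model sizes are rough ballpark figures, not measurements from any particular deployment:

```python
# Illustrative sanity check: does a model artifact fit typical serverless limits?
# The figures are AWS Lambda's documented caps; other providers differ.
LAMBDA_UNZIPPED_LIMIT_MB = 250   # code + dependencies, unzipped
LAMBDA_MEMORY_LIMIT_MB = 10_240  # maximum configurable function memory

def fits_lambda(artifact_bytes: int, peak_memory_mb: int) -> bool:
    """Return True if the artifact size and runtime memory footprint
    both fall within the Lambda limits defined above."""
    artifact_mb = artifact_bytes / (1024 * 1024)
    return (artifact_mb <= LAMBDA_UNZIPPED_LIMIT_MB
            and peak_memory_mb <= LAMBDA_MEMORY_LIMIT_MB)

# A full-size BERT-base checkpoint is roughly 440 MB on disk, while a
# distilled and quantized variant can come in well under 100 MB.
print(fits_lambda(440 * 1024 * 1024, 2_048))  # full BERT-base: too big
print(fits_lambda(90 * 1024 * 1024, 1_024))   # slimmed-down model: fits
```

This is exactly the squeeze that pushes teams toward distillation and quantization before a serverless deployment becomes practical.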



To learn more about these developments, we are pleased to welcome Marek Šuppa to the fourth installment of our MCubed web lecture series for machine learning practitioners on December 2nd. Šuppa is Head of Data at Q&A and survey app maker Slido, where he and a few colleagues spent part of last year investigating ways to adapt sentiment analysis and classification models for use in serverless environments – without the dreaded loss of performance.

In his talk, Šuppa will cover his team's use case, the reasons they considered serverless, the problems they ran into along the way, and the approaches they found most promising for achieving latencies suitable for production deployments.

As usual, the webcast on December 2nd starts at 1100 GMT (1200 CET) with a recap of machine learning news related to software development, giving you a few minutes to settle in before we dive into model deployment in serverless environments.

We look forward to seeing you there; we will even send you a short reminder on the day – just register here. ®


