
Handling large dependencies in your AWS python lambdas with serverless

TL;DR Reduce the size of your AWS python lambdas by over 30% and save precious seconds during cold-starts using the new slim option of the serverless-python-requirements plugin with the serverless framework.


If you have been using AWS lambdas to build your services, you will most likely have run into the limits enforced by AWS for lambdas.
One of these limits is the size of the deployment package, a simple zip file, which is limited to 50 MB. This can be extended to well over 100 MB if you are deploying your lambdas using a CloudFormation stack, but eventually you will run into another hard limit: the extracted files cannot exceed 250 MB.

Lately, EUROPACE has been building an AI system which heavily relies on many of the excellent python machine learning packages like sklearn, numpy and scipy, to name a few. During this endeavor, we hit the 250 MB hard limit and were forced to look for solutions. Luckily, we were already using the serverless framework in conjunction with the serverless-python-requirements plugin, which allowed us to use a zip-in-zip approach to deploy our requirements: the external dependencies are zipped, and that artifact is then zipped again along with your source code. The zipped dependencies are extracted at runtime. This kept us under the 250 MB limit at deployment time, but extracting the requirements at runtime meant excruciating cold-start times of over 30 seconds.
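With the plugin's zip option, the handler extracts the zipped dependencies on its first invocation, before any of them can be imported. A minimal sketch of such a handler (the handler body itself is illustrative):

```python
# The plugin ships a helper module, unzip_requirements, which extracts
# the zipped dependencies into /tmp on first import (i.e. on cold start).
try:
    import unzip_requirements  # only present when packaged with the zip option
except ImportError:
    pass  # running locally, dependencies are installed normally

import json


def handler(event, context):
    # Heavy imports (sklearn, numpy, ...) would happen here,
    # after the dependencies have been extracted.
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

The try/except makes the same code work locally, where the helper module does not exist.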

Most python packages ship with a large amount of test code (roughly 10% of their total size), which is usually not needed in a production environment. Behold the newly merged PR to the serverless-python-requirements plugin by dee-me-tree-or-love. It adds a couple of new options to the plugin, giving you additional control over the deployable size of your artifact.

serverless-python-requirements plugin in action

The plugin has already supported zipping the requirements for quite some time. However, having to unpack at runtime resulted in very long cold starts, especially when there was a lot to unpack.
With the new slim option, there is no need to unpack at runtime. Unnecessary "package waste" (i.e. tests, __pycache__ directories, dist-info metadata and inessential information in binary *.so dependencies) is not packaged with your code and instead discarded.

The option is enabled like this:

custom:
  pythonRequirements:
    slim: true

and you can optionally add another parameter, slimPatterns, where you can pass a list of patterns you want to ignore during packaging (think datasets that come shipped with some data science packages).
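A serverless.yml fragment along these lines could strip additional files (the patterns shown are purely illustrative; adapt them to your dependencies):

```yaml
custom:
  pythonRequirements:
    slim: true
    slimPatterns:
      - "**/*.egg-info*"
      - "**/tests/**"
```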

Enabling slim can save a lot of space in your deployment, as can be seen below for a very simple machine-learning lambda (<30 lines of code) having "just" sklearn, numpy and scipy as dependencies.

The slim option of the plugin (r.) compared to the standard packaging (l.)

Enabling the slim option saves over 20 MB on this artifact, which is a >33% saving! Not bad!

Unfortunately, the option is not yet included in the current release (4.0.5 at the time of writing) of the serverless-python-requirements plugin, which was released just before the PR was merged. Thus it will not be installed by npm automatically. If you want to use this new feature, you have to install the plugin via GitHub:

npm install --save-dev git://github.com/UnitedIncome/serverless-python-requirements.git#master
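Regardless of how it is installed, the plugin still needs to be registered in your serverless.yml as usual:

```yaml
plugins:
  - serverless-python-requirements
```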

Thanks for stopping by, and stay tuned for a post on how the different options compare in terms of cold-start behavior.

Update 2018-06-18: v4.1.0 of the serverless-python-requirements plugin is now live. Go ahead and install the plugin via npm. Thanks, dschep.