AWS Lambda

Last updated on July 16, 2024

AWS Lambda Cheat Sheet

  • A serverless compute service.
  • Lambda executes your code only when needed and scales automatically.
  • Lambda functions are stateless – no affinity to the underlying infrastructure.
  • You choose the amount of memory you want to allocate to your functions and AWS Lambda allocates proportional CPU power, network bandwidth, and disk I/O.
  • AWS Lambda is SOC, HIPAA, PCI, ISO compliant.
  • Natively supports the following languages:
    • Node.js
    • Java
    • C#
    • Go
    • Python
    • Ruby
    • PowerShell
  • You can also provide your own custom runtime.

 

Components of a Lambda Application

  • Function – a script or program that runs in Lambda. Lambda passes invocation events to your function; the function processes an event and returns a response (see the handler sketch after this list).
  • Execution environment – a secure, isolated micro virtual machine where a Lambda function is executed.
  • Runtimes – Lambda runtimes allow functions in different languages to run in the same base execution environment. The runtime sits in-between the Lambda service and your function code, relaying invocation events, context information, and responses between the two.
  • Environment variables – key-value pairs that you can use to store configuration settings for your function. They can be used to pass dynamic parameters to your function at runtime, such as database connection strings, API keys, and other sensitive information.
  • Layers – Lambda layers are a distribution mechanism for libraries, custom runtimes, and other function dependencies. Layers let you manage your in-development function code independently from the unchanging code and resources that it uses.
  • Event source – an AWS service or a custom service that triggers your function and executes its logic.
  • Downstream resources – an AWS service that your Lambda function calls once it is triggered.
  • Log streams – While Lambda automatically monitors your function invocations and reports metrics to CloudWatch, you can annotate your function code with custom logging statements that allow you to analyze the execution flow and performance of your Lambda function.
  • AWS Serverless Application Model (AWS SAM) – an open-source framework, built on AWS CloudFormation, for defining and deploying serverless applications using shorthand templates.
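
To make the function component concrete, here is a minimal sketch of a Python handler; the TABLE_NAME environment variable and the response shape are illustrative assumptions, not requirements:

    import json
    import os

    def lambda_handler(event, context):
        # Lambda passes the invocation event and a context object to the handler.
        # TABLE_NAME is a hypothetical environment variable set in the function config.
        table_name = os.environ.get("TABLE_NAME", "unknown")

        # Custom log statements like this end up in the function's CloudWatch log stream.
        print(f"Received event for table {table_name}: {json.dumps(event)}")

        # The returned value becomes the response for synchronous invocations.
        return {"statusCode": 200, "body": json.dumps({"ok": True})}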

Lambda Functions

  • You can upload your application code as a ZIP file or a container image hosted on Amazon Elastic Container Registry (Amazon ECR).
  • To create a Lambda function, you first package your code and dependencies in a deployment package. Then, you upload the deployment package to create your Lambda function.
  • After your Lambda function is in production, Lambda automatically monitors functions on your behalf, reporting metrics through Amazon CloudWatch.
  • Configure basic function settings, including the description, memory usage, storage (512MB – 10GB), execution timeout (15 minutes max), and the role that the function will use to execute your code.
  • Environment variables are always encrypted at rest and can be encrypted in transit as well.
  • Versions – a snapshot of your function’s state at a given time. When you publish a new version, a :version-number is appended to your function’s ARN:
    • arn:aws:lambda:us-east-2:123456789123:function:my-function:1
  • Aliases – a pointer to a Lambda function version. An alias gives a version a stable, human-readable name, making it easier to remember and understand what the function does (see the sketch after this list). An alias ARN follows this format:
    • arn:aws:lambda:us-east-2:123456789123:function:my-function:MyAlias
  • A layer is a ZIP archive that contains libraries, a custom runtime, or other dependencies. Use layers to manage your function’s dependencies independently and keep your deployment package small.
  • You can configure a function to mount an Amazon EFS file system to a local directory. With Amazon EFS, your function code can access and modify shared resources securely and at high concurrency.
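
As a sketch of how versions and aliases relate, the boto3 calls below publish a version and point an alias at it; the function and alias names are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    # Publish a snapshot of the function's current code and configuration.
    version = lambda_client.publish_version(
        FunctionName="my-function",      # placeholder name
        Description="Release candidate",
    )["Version"]                         # e.g., "1"

    # Point a human-readable alias at the published version.
    lambda_client.create_alias(
        FunctionName="my-function",
        Name="MyAlias",                  # matches the alias ARN format above
        FunctionVersion=version,
    )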

Invoking Lambda Functions

  • Lambda supports synchronous and asynchronous invocation of a Lambda function.
  • Synchronous invocation
    • Lambda runs the function, waits for it to finish, and returns the function’s response (or error) directly to the caller.
    • typically used by services that need an immediate result, such as Amazon API Gateway and Application Load Balancer.
  • Asynchronous invocation
    • when a function is invoked asynchronously, AWS Lambda stores the event in an internal queue and handles the invocation
    • the Lambda function returns a 202 status code (Accepted) immediately after being invoked, and the processing continues in the background. The 202 code just confirms that the event is queued; it does not indicate whether the function runs successfully or not.
    • typically used for long-latency processes that run in the background, such as batch operations, video encoding, and order processing.
    • can only accept a payload of up to 256 KB.
    • examples of AWS services that invoke Lambda functions asynchronously: Amazon S3, Amazon SNS, Amazon EventBridge (CloudWatch Events), AWS CodeCommit, and AWS CloudFormation.
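
The difference between the two invocation types is visible in a boto3 sketch; the function name and payload are placeholders:

    import json
    import boto3

    lambda_client = boto3.client("lambda")
    payload = json.dumps({"orderId": "12345"}).encode()  # hypothetical event

    # Synchronous: Lambda runs the function and returns its response (HTTP 200).
    sync_resp = lambda_client.invoke(
        FunctionName="my-function",
        InvocationType="RequestResponse",
        Payload=payload,
    )
    print(sync_resp["Payload"].read())

    # Asynchronous: Lambda queues the event and returns immediately (HTTP 202).
    async_resp = lambda_client.invoke(
        FunctionName="my-function",
        InvocationType="Event",
        Payload=payload,
    )
    print(async_resp["StatusCode"])  # 202 confirms the event was queued, nothing more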

Event Source Mapping

  • Event source mapping is a Lambda resource that reads from a queue or stream and synchronously invokes a Lambda function.
  • You can apply an event-filtering pattern to process events that are only relevant to your application. This allows you to save money by reducing the number of function invocations.
  • Event source mapping invokes a function if one of the following conditions is met:
    • The batch size is reached
    • The maximum batching window is reached
    • The total payload reaches 6 MB (the synchronous invocation payload limit)
  • Lambda provides event source mappings for the following services: Amazon DynamoDB (Streams), Amazon Kinesis Data Streams, Amazon SQS, Amazon MQ, Amazon MSK, self-managed Apache Kafka, and Amazon DocumentDB.
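
As an illustration, the boto3 call below maps an SQS queue to a function with a batch size, a batching window, and an event filter; the ARN and filter pattern are assumptions made for the example:

    import json
    import boto3

    lambda_client = boto3.client("lambda")

    lambda_client.create_event_source_mapping(
        EventSourceArn="arn:aws:sqs:us-east-2:123456789123:my-queue",  # placeholder
        FunctionName="my-function",
        BatchSize=10,                        # invoke once 10 records are collected...
        MaximumBatchingWindowInSeconds=30,   # ...or once 30 seconds have elapsed
        FilterCriteria={                     # skip events that don't match the pattern
            "Filters": [{"Pattern": json.dumps({"body": {"type": ["order"]}})}]
        },
    )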

Deploying Code with External Dependencies

  • AWS Lambda includes a number of pre-built dependencies for specific runtimes (for example, the AWS SDK is included in the Node.js and Python runtimes). These can be used by your code without being included in your deployment package.
  • If you’re using an external library/SDK/module in your Lambda code, do the following steps:
    1. Place all external dependencies locally in your application’s folder.
    2. Create a ZIP deployment package of your Lambda function.
    3. Upload the deployment package to AWS Lambda. You can upload the file directly through the AWS Lambda console or store it first in Amazon S3 and deploy it from there.
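
A sketch of step 3 using boto3; the file, bucket, and function names are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    # Option A: upload the ZIP file directly (suitable for small packages).
    with open("function.zip", "rb") as f:      # hypothetical package from step 2
        lambda_client.update_function_code(
            FunctionName="my-function",
            ZipFile=f.read(),
        )

    # Option B: stage the package in S3 first, then deploy it from there.
    boto3.client("s3").upload_file("function.zip", "my-bucket", "function.zip")
    lambda_client.update_function_code(
        FunctionName="my-function",
        S3Bucket="my-bucket",
        S3Key="function.zip",
    )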

Concurrency Management

  • Concurrency is the number of instances that serve requests at a given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function finishes running, it can handle another request. If the function is invoked again while a request is still being processed, another instance is allocated, which increases the function’s concurrency.
  • To ensure that a function can always reach a certain level of concurrency, you can configure the function with reserved concurrency. When a function has reserved concurrency, no other function can use that concurrency. Reserved concurrency also limits the maximum concurrency for the function.
  • To enable your function to scale without fluctuations in latency, use provisioned concurrency. By allocating provisioned concurrency before an increase in invocations, you can ensure that all requests are served by initialized instances with very low latency.
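
Both settings can be applied with boto3, as in this sketch (the numbers are illustrative):

    import boto3

    lambda_client = boto3.client("lambda")

    # Reserved concurrency: the function can always reach 100 concurrent
    # executions, and is also capped at 100; no other function can use this share.
    lambda_client.put_function_concurrency(
        FunctionName="my-function",
        ReservedConcurrentExecutions=100,
    )

    # Provisioned concurrency: keep 10 execution environments initialized for a
    # published version or alias so bursts are served without cold starts.
    lambda_client.put_provisioned_concurrency_config(
        FunctionName="my-function",
        Qualifier="MyAlias",          # requires a version or alias, not $LATEST
        ProvisionedConcurrentExecutions=10,
    )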

Lambda Function URL

  • With the function URL feature of AWS Lambda, you can launch a secure HTTPS endpoint dedicated to your Lambda function.
  • You no longer need an intermediary service such as Amazon API Gateway to invoke your function over HTTPS; simply send an HTTP request to your function’s unique URL.
  • Function URL endpoints are publicly accessible by default and have the following format:
    • https://<url-id>.lambda-url.<region>.on.aws
  • A Lambda Function URL can be created and configured via the AWS Lambda console or through the Lambda API.
  • Upon creating a function URL, AWS Lambda automatically generates a unique URL endpoint for you that you can immediately use.
  • This URL endpoint is static and doesn’t change once created.
  • Function URLs are dual-stack enabled, supporting both IPv4 and IPv6.
  • The URL can be invoked via a web browser, cURL, Postman, or any HTTP client.
  • There are 2 authentication types for controlling access to a Lambda function URL:
    • AWS_IAM – uses IAM to authenticate and authorize users. Only IAM users or roles that have been granted permission to invoke the function through IAM policies will be able to do so.
    • NONE – allows anyone who has the function URL to execute the Lambda function whether they have an AWS account or not.
  • You can access your function URL through the public Internet only, not via AWS PrivateLink (e.g., VPC endpoints).
  • Uses resource-based policies for security and access control. You can further secure your function URL by enabling cross-origin resource sharing (CORS) to whitelist origins permitted to invoke it.
  • A function URL can be attached to any Lambda function alias or to the $LATEST unpublished function version, but not to any other function version.
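
A minimal boto3 sketch for creating a function URL on an alias; the names and the CORS origin are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    resp = lambda_client.create_function_url_config(
        FunctionName="my-function",
        Qualifier="MyAlias",        # an alias or $LATEST, never a numbered version
        AuthType="AWS_IAM",         # or "NONE" for unauthenticated access
        Cors={"AllowOrigins": ["https://example.com"]},  # restrict permitted origins
    )
    print(resp["FunctionUrl"])      # https://<url-id>.lambda-url.<region>.on.aws/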

Configuring a Lambda Function to Access Resources in a VPC

In AWS Lambda, you can set up your function to establish a connection to your virtual private cloud (VPC). With this connection, your function can access the private resources of your VPC during execution, such as EC2 instances and RDS databases.

By default, Lambda runs your function code securely within a VPC that is owned and managed by the Lambda service. Alternatively, you can enable your function to access resources inside your own private VPC by providing VPC-specific configuration information such as subnet IDs and security group IDs. Lambda uses this information to set up elastic network interfaces (ENIs) that enable your function to connect securely to other resources within your VPC.
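
A sketch of the VPC configuration via boto3; the subnet and security group IDs are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    # Attach the function to private subnets; Lambda creates the elastic network
    # interfaces needed to reach resources behind these security groups.
    lambda_client.update_function_configuration(
        FunctionName="my-function",
        VpcConfig={
            "SubnetIds": ["subnet-0abc1234", "subnet-0def5678"],
            "SecurityGroupIds": ["sg-0123456789abcdef0"],
        },
    )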

 

Lambda@Edge

  • Lets you run Lambda functions to customize content that CloudFront delivers, executing the functions in AWS locations closer to the viewer. The functions run in response to CloudFront events, without provisioning or managing servers.
  • You can use Lambda functions to change CloudFront requests and responses at the following points:
    • After CloudFront receives a request from a viewer (viewer request)
    • Before CloudFront forwards the request to the origin (origin request)
    • After CloudFront receives the response from the origin (origin response)
    • Before CloudFront forwards the response to the viewer (viewer response)
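
A minimal viewer-request sketch in Python; the header name and value are illustrative, not prescribed by CloudFront:

    def lambda_handler(event, context):
        # CloudFront wraps the request in Records[0].cf for all four trigger points.
        request = event["Records"][0]["cf"]["request"]

        # Add a custom header before CloudFront continues processing the request.
        request["headers"]["x-experiment-group"] = [
            {"key": "X-Experiment-Group", "value": "blue"}
        ]

        # Returning the (modified) request lets the request flow continue normally.
        return request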

Deploying and Monitoring Lambda Functions

  • You can automate your serverless application’s release process using AWS CodePipeline and AWS CodeDeploy.
  • Lambda automatically tracks the behavior of your function invocations and provides metrics that you can monitor. These metrics let you analyze the full function invocation spectrum, including event source integration and whether downstream resources perform as expected.

 

AWS Lambda SnapStart

  • Lambda SnapStart speeds up your Java applications by reusing a single initialized snapshot to quickly resume multiple execution environments.
  • You can use the Lambda SnapStart for Java feature to decrease cold start times without provisioning additional resources. It also removes the burden of implementing complex performance optimizations for your Java application.
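
SnapStart is switched on in the function configuration and takes effect when a version is published, as in this boto3 sketch (the function name is a placeholder):

    import boto3

    lambda_client = boto3.client("lambda")

    # Enable SnapStart; the snapshot is taken at publish time, so it applies to
    # published versions (and their aliases), not to $LATEST.
    lambda_client.update_function_configuration(
        FunctionName="my-java-function",
        SnapStart={"ApplyOn": "PublishedVersions"},
    )
    lambda_client.publish_version(FunctionName="my-java-function")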

AWS Lambda Pricing

  • You are charged based on the total number of requests for your functions and the duration, i.e., the time it takes for your code to execute (billed per millisecond and scaled by the amount of memory you allocate).
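
A back-of-the-envelope sketch of the pricing model; the rates below are illustrative list prices (us-east-1, x86) and vary by region and architecture:

    requests = 3_000_000                 # invocations per month
    avg_duration_s = 0.120               # 120 ms average execution time
    memory_gb = 0.5                      # 512 MB allocated

    request_cost = requests / 1_000_000 * 0.20          # assumed $0.20 per 1M requests
    gb_seconds = requests * avg_duration_s * memory_gb  # duration scaled by memory
    duration_cost = gb_seconds * 0.0000166667           # assumed $ per GB-second

    # 180,000 GB-s -> ~$0.60 requests + ~$3.00 duration = ~$3.60/month (before free tier)
    print(f"{gb_seconds:,.0f} GB-s -> ${request_cost + duration_cost:,.2f}/month")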

 


Validate Your AWS Lambda Knowledge

Question 1

A company is deploying the package of its Lambda function, which is compressed as a ZIP file, to AWS. However, they are getting an error in the deployment process because the package is too large. The manager instructed the developer to keep the deployment package small to make the development process much easier and more modularized. This should also help prevent errors that may occur when dependencies are installed and packaged with the function code.


Which of the following options is the MOST suitable solution that the developer should implement?

  1. Upload the deployment package to S3.
  2. Zip the deployment package again to further compress the zip file.
  3. Upload the other dependencies of your function as a separate Lambda Layer instead.
  4. Compress the deployment package as a TAR file instead.

Correct Answer: 3

You can configure your Lambda function to pull in additional code and content in the form of layers. A layer is a ZIP archive that contains libraries, a custom runtime, or other dependencies. With layers, you can use libraries in your function without needing to include them in your deployment package.

Layers let you keep your deployment package small, which makes development easier. You can avoid errors that can occur when you install and package dependencies with your function code. For Node.js, Python, and Ruby functions, you can develop your function code in the Lambda console as long as you keep your deployment package under 3 MB.

A function can use up to 5 layers at a time. The total unzipped size of the function and all layers can’t exceed the unzipped deployment package size limit of 250 MB.

You can create layers, or use layers published by AWS and other AWS customers. Layers support resource-based policies for granting layer usage permissions to specific AWS accounts, AWS Organizations, or all accounts. Layers are extracted to the /opt directory in the function execution environment. Each runtime looks for libraries in a different location under /opt, depending on the language. Structure your layer so that function code can access libraries without additional configuration.
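
As a sketch, publishing and attaching a layer with boto3 looks like this; the layer name, archive, and runtime are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    # Publish shared dependencies as a layer. The ZIP must be structured so the
    # runtime finds libraries under /opt (e.g., a python/ folder for Python packages).
    with open("dependencies.zip", "rb") as f:    # hypothetical layer archive
        layer = lambda_client.publish_layer_version(
            LayerName="my-shared-libs",
            Content={"ZipFile": f.read()},
            CompatibleRuntimes=["python3.12"],
        )

    # Attach the layer (up to 5 per function) without touching the code package.
    lambda_client.update_function_configuration(
        FunctionName="my-function",
        Layers=[layer["LayerVersionArn"]],
    )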

Hence, the correct answer is to upload the other dependencies of your function as a separate Lambda Layer instead.

Uploading the deployment package to S3 is incorrect. Although you can upload large deployment packages of over 50 MB in size via S3, your function will still be in a single layer. This doesn’t meet the requirement of making the deployment package small and modularized. You have to use Lambda Layers instead.

Zipping the deployment package again to further compress the zip file is incorrect because doing this will not significantly make the ZIP file smaller.

Compressing the deployment package as a TAR file instead is incorrect. Although it may decrease the size of the deployment package, it is still not enough to fully solve the issue. A compressed TAR file is not significantly smaller than a ZIP file.

References:
https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
https://docs.aws.amazon.com/lambda/latest/dg/limits.html

Note: This question was extracted from our AWS Certified Developer Associate Practice Exams.

Question 2

A financial company has several AWS accounts that fetch market data from several cryptocurrency exchanges. The company is planning to launch a centralized logging ingestion system that automatically converts the incoming application log files to Apache Parquet format and stores the logs in an Amazon S3 bucket for easier processing.

The data engineer has been tasked to ensure that the log files must be delivered in near real-time to provide accurate crypto market statistics.

Which of the following options can meet this requirement with the LEAST operational overhead?

  1. Use the Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to forward the log files to an Amazon S3 bucket which will automatically transform the logs into Apache Parquet format.
  2. Set up the solution to send the cryptocurrency log files to Amazon Data Firehose. Set up the Data Firehose to trigger a Lambda function that converts the log files to Apache Parquet format and delivers the files to the centralized S3 bucket.
  3. Modify the solution to forward the cryptocurrency log files to Amazon Kinesis Data Streams and install the Kinesis Client Library on an Auto Scaling group of Amazon EC2 instances. Configure the EC2 instances to fetch the stream records and automatically convert the log files to Apache Parquet. Store the processed log files in Amazon S3.
  4. Refactor the solution to send the cryptocurrency log files to Apache Hive on an Amazon EMR cluster. Launch a table from the log files by using a custom regular expression (regex). Set up an external table on Amazon S3 in Hive with the file format set to Apache Parquet and schedule a HiveQL UNLOAD query to persist the log files to the external Amazon S3 table.

Correct Answer: 2

Amazon Data Firehose can convert the format of your input data from JSON to Apache Parquet or Apache ORC before storing the data in Amazon S3. Parquet and ORC are columnar data formats that save space and enable faster queries compared to row-oriented formats like JSON. If you want to convert an input format other than JSON, such as comma-separated values (CSV) or structured text, you can use AWS Lambda to transform it to JSON first.

Amazon Data Firehose is a fully managed service for delivering real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon OpenSearch Service, Amazon OpenSearch Serverless, Splunk, and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers, including Datadog, Dynatrace, LogicMonitor, MongoDB, New Relic, Coralogix, and Elastic.
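
For context, a Firehose transformation function must echo each record’s recordId with a result of Ok, Dropped, or ProcessingFailed, and return base64-encoded data. A minimal Python sketch, with purely illustrative reshaping logic:

    import base64
    import json

    def lambda_handler(event, context):
        output = []
        for record in event["records"]:
            # Incoming data is base64-encoded by Firehose.
            raw = base64.b64decode(record["data"]).decode("utf-8")

            # Hypothetical normalization to JSON so Firehose's built-in format
            # conversion can write the records out as Parquet.
            normalized = json.dumps({"message": raw.strip()})

            output.append({
                "recordId": record["recordId"],      # must match the input record
                "result": "Ok",
                "data": base64.b64encode(normalized.encode("utf-8")).decode("utf-8"),
            })
        return {"records": output}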

Hence, the correct answer is: Set up the solution to send the cryptocurrency log files to Amazon Data Firehose. Set up the Data Firehose to trigger a Lambda function that converts the log files to Apache Parquet format and delivers the files to the centralized S3 bucket.

The option that says: Use the Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to forward the log files to an Amazon S3 bucket which will automatically transform the logs into Apache Parquet format is incorrect because Amazon Managed Workflows for Apache Airflow is primarily used as a managed orchestration service for Apache Airflow and not for converting log files into Apache Parquet format.

The option that says: Modify the solution to forward the cryptocurrency log files to Amazon Kinesis Data Streams and install the Kinesis Client Library on an Auto Scaling group of Amazon EC2 instances. Configure the EC2 instances to fetch the stream records and automatically convert the log files to Apache Parquet. Store the processed log files in Amazon S3 is incorrect. Take note that the scenario explicitly mentioned that the solution should have the least operational overhead in terms of managing AWS resources. Maintaining an Auto Scaling group of Amazon EC2 instances entails significant management and upkeep, which does not conform to the aforementioned requirement.

The option that says: Refactor the solution to send the cryptocurrency log files to Apache Hive on an Amazon EMR cluster. Launch a table from the log files by using a custom regular expression (regex). Set up an external table on Amazon S3 in Hive with the file format set to Apache Parquet and schedule a HiveQL UNLOAD query to persist the log files to the external Amazon S3 table is incorrect. Just like the previous option, using an EC2-based Amazon EMR cluster requires maintenance and certain operational tasks. Using Amazon EMR could be a possible solution if the option mentioned that it uses the Amazon EMR Serverless type.

 

References:

https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
https://aws.amazon.com/blogs/big-data/persist-streaming-data-to-amazon-s3-using-amazon-kinesis-firehose-and-aws-lambda/
https://aws.amazon.com/blogs/compute/amazon-kinesis-firehose-data-transformation-with-aws-lambda/

Note: This question was extracted from our AWS Certified Data Analytics Specialty Practice Exams.

For more AWS practice exam questions with detailed explanations, check out the Tutorials Dojo Portal: Tutorials Dojo AWS Practice Tests

AWS Lambda Cheat Sheet References:

https://docs.aws.amazon.com/lambda/latest/dg
https://aws.amazon.com/lambda/features/
https://aws.amazon.com/lambda/pricing/
https://aws.amazon.com/lambda/faqs/


Written by: Jon Bonso

Jon Bonso is the co-founder of Tutorials Dojo, an EdTech startup and an AWS Digital Training Partner that provides high-quality educational materials in the cloud computing space. He graduated from Mapúa Institute of Technology in 2007 with a bachelor's degree in Information Technology. Jon holds 10 AWS Certifications and is also an active AWS Community Builder since 2020.
