Last updated on January 3, 2026
Amazon MSK Cheat Sheet
A fully managed Apache Kafka service for ingesting and processing streaming data in real time.
Concepts
- Configuration
- If you do not specify a custom MSK configuration, a default configuration will be assigned to a cluster.
- You can apply a custom configuration to new or existing MSK clusters.
- MSK configurations allow you to specify the properties to be set as well as the values to be assigned to them.
- Amazon MSK now supports Apache Kafka versions 3.8.x, 3.9.x, 4.0.x, and 4.1.x; check the documentation for compatibility details and upgrade procedures.
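As a rough sketch of how a custom configuration might be assembled, the snippet below renders a dictionary of broker properties into the key=value text that the MSK CreateConfiguration API accepts as ServerProperties. The helper name render_server_properties, the property values, and the configuration name are illustrative assumptions, and the boto3 call is left commented out because it needs AWS credentials.

```python
def render_server_properties(props: dict) -> bytes:
    """Render broker properties as sorted key=value lines, encoded as bytes."""
    lines = [f"{key}={value}" for key, value in sorted(props.items())]
    return "\n".join(lines).encode("utf-8")


# Illustrative values only; choose properties that suit your workload.
example_props = {
    "auto.create.topics.enable": "false",
    "default.replication.factor": 3,
    "min.insync.replicas": 2,
}
server_properties = render_server_properties(example_props)
print(server_properties.decode("utf-8"))

# With credentials configured, the blob could be registered roughly like this:
# import boto3
# boto3.client("kafka").create_configuration(
#     Name="example-config",          # hypothetical configuration name
#     KafkaVersions=["3.8.x"],
#     ServerProperties=server_properties,
# )
```

Any property you leave out of the custom configuration keeps the value from the default configuration.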
- MSK Serverless
- A cluster type that enables you to run Apache Kafka without the need to manage or scale cluster capacity.
- MSK Serverless now supports high partition counts, enabling more scalable streaming workloads per cluster.
- Automatically provisions and scales capacity while managing the partitions in your topics.
- Integrated with the following services:
- AWS PrivateLink – for private connectivity.
- AWS IAM – for authentication and authorization.
- AWS Glue Schema Registry – for schema management.
- Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics) – for Apache Flink-based stream processing.
- AWS Lambda – for event processing.
- To modify topic-level configuration, use Apache Kafka commands.
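To illustrate the last point, here is a minimal sketch that builds a kafka-configs.sh command line for changing a topic-level setting. The bootstrap address, topic name, and client.properties file are hypothetical placeholders, and the helper kafka_configs_alter_topic is not part of any library; it only assembles arguments for the standard Kafka CLI tool.

```python
import shlex


def kafka_configs_alter_topic(bootstrap, topic, overrides, command_config=None):
    """Build the argv for kafka-configs.sh to add or update topic overrides."""
    if not overrides:
        raise ValueError("at least one override is required")
    args = [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap,
        "--entity-type", "topics",
        "--entity-name", topic,
        "--alter",
        "--add-config", ",".join(f"{k}={v}" for k, v in overrides.items()),
    ]
    if command_config:
        # Client settings (for example IAM authentication) live in this file.
        args += ["--command-config", command_config]
    return args


argv = kafka_configs_alter_topic(
    bootstrap="boot-example.kafka-serverless.us-east-1.amazonaws.com:9098",  # hypothetical
    topic="orders",                                                          # hypothetical
    overrides={"retention.ms": 86400000},
    command_config="client.properties",
)
print(shlex.join(argv))
```

The printed command can be run from any client machine that has the Apache Kafka tools installed and network access to the cluster.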
- MSK Connect
- Enables you to stream data to and from Apache Kafka clusters.
- Deploy connectors built for Kafka Connect that move data into or pull data from data stores such as Amazon S3 and Amazon OpenSearch Service.
- A connector continuously copies data from a streaming data source or from a cluster into a data sink.
- Source connectors – import data from external systems into your topics.
- Sink connectors – export data from your topics to external systems.
- MSK Connect now supports EventBridge as a sink connector for streaming events from Kafka topics.
- A worker is a JVM process that runs the connector logic.
- Each worker creates a set of tasks that can operate in parallel threads and copy the data.
- The total capacity of a connector is determined by the number of workers and the number of MSK Connect Units (MCUs) per worker.
- The two capacity modes are:
- Provisioned – you specify the number of workers and the MCUs per worker.
- Autoscaled – you specify the minimum and maximum number of workers.
- A plugin contains the code that defines the logic of the connector. You can use the same plugin to create one or more connectors.
- A configuration provider allows you to specify variables in a connector or worker configuration instead of plaintext, and workers running in your connector resolve these variables at runtime.
- To allow Amazon MSK Connect to access the internet, you can use Amazon VPC and set up a NAT gateway or NAT instance.
- MSK Connect now exposes SinkConsumerByteRate and SourceProducerByteRate metrics to monitor connector throughput.
- The UpdateConnector API allows updating existing MSK Connect connector configurations without creating new connectors.
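The two capacity modes described above can be sketched as the capacity structure passed to the MSK Connect CreateConnector API. The helper names and the default CPU thresholds below are illustrative assumptions, not recommended values; the dictionary keys follow the shape used by the boto3 kafkaconnect client as far as I am aware.

```python
def provisioned_capacity(worker_count, mcu_count):
    """Fixed capacity: a set number of workers, each with mcu_count MCUs."""
    if worker_count < 1 or mcu_count < 1:
        raise ValueError("worker_count and mcu_count must be positive")
    return {"provisionedCapacity": {"workerCount": worker_count, "mcuCount": mcu_count}}


def autoscaled_capacity(min_workers, max_workers, mcu_count,
                        scale_in_cpu=20, scale_out_cpu=80):
    """Autoscaled capacity: workers vary between min_workers and max_workers."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("need 1 <= min_workers <= max_workers")
    return {
        "autoScaling": {
            "minWorkerCount": min_workers,
            "maxWorkerCount": max_workers,
            "mcuCount": mcu_count,
            # Assumed thresholds for illustration only.
            "scaleInPolicy": {"cpuUtilizationPercentage": scale_in_cpu},
            "scaleOutPolicy": {"cpuUtilizationPercentage": scale_out_cpu},
        }
    }


def mcu_range(capacity):
    """Return (minimum, maximum) total MCUs: workers multiplied by MCUs per worker."""
    if "provisionedCapacity" in capacity:
        p = capacity["provisionedCapacity"]
        total = p["workerCount"] * p["mcuCount"]
        return (total, total)
    a = capacity["autoScaling"]
    return (a["minWorkerCount"] * a["mcuCount"], a["maxWorkerCount"] * a["mcuCount"])


print(mcu_range(provisioned_capacity(2, 4)))
print(mcu_range(autoscaled_capacity(1, 4, 2)))
```

Either dictionary could be supplied as the capacity argument when creating a connector; mcu_range only makes the "workers times MCUs per worker" relationship explicit.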
- Connecting to an Amazon MSK cluster
- By default, an MSK cluster can only be accessed by clients who are in the same VPC as the cluster.
- MSK supports Standard and Express brokers. Express brokers have updated throughput quotas and new high-partition support.
- MSK Replicator now supports the WriteDataIdempotently permission to ensure reliable replication between clusters.
- If you want to connect to your MSK cluster from a client that’s outside the cluster’s VPC, you can do the following:
- Turn on public access to a cluster.
- Use VPC peering, AWS Direct Connect, AWS Transit Gateway, VPN connections, REST proxies, multi-Region multi-VPC connectivity, or EC2-Classic.
- Connect on the port that MSK assigns to your authentication method; public access uses a separate set of ports.
- The state of your cluster defines what actions you can and cannot perform.
- You can use Apache Kafka’s MirrorMaker to migrate:
- From an Apache Kafka cluster to Amazon MSK
- From one MSK cluster to another
- With LinkedIn’s Cruise Control, you can rebalance the MSK cluster, detect and fix anomalies, and monitor the cluster’s state and health.
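Since the port a client uses depends on how it authenticates and whether it connects through public access, a small lookup can make the choice explicit. The port numbers below reflect the MSK port information as I understand it; verify them against the current documentation before relying on them. The function name and the authentication labels are made up for this sketch.

```python
# (authentication method, public access?) -> broker port
BROKER_PORTS = {
    ("plaintext", False): 9092,
    ("tls", False): 9094,
    ("sasl_scram", False): 9096,
    ("iam", False): 9098,
    ("tls", True): 9194,
    ("sasl_scram", True): 9196,
    ("iam", True): 9198,
}


def broker_port(auth, public=False):
    """Return the broker port for an authentication method and access path."""
    try:
        return BROKER_PORTS[(auth, public)]
    except KeyError:
        raise ValueError(f"no port for auth={auth!r} with public={public}") from None


print(broker_port("iam"))
print(broker_port("tls", public=True))
```

Plaintext has no public-access entry in this table, so asking for it raises a ValueError instead of returning a port.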
Amazon Managed Streaming Security
- Use IAM to control who can perform Apache Kafka operations on a cluster.
- If you add new brokers after changing a cluster’s security group, you must update the new brokers’ ENIs.
- To limit access to Apache ZooKeeper nodes, assign them a separate security group.
- Express brokers now support non-disruptive certificate renewals, eliminating downtime during mandatory certificate updates.
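As an example of using IAM to control Apache Kafka operations, the sketch below generates a policy document that lets a producer connect to a cluster and write to one topic. The account ID, cluster name, cluster UUID, and topic are placeholders, and producer_policy is a hypothetical helper; the action names and ARN layout follow the kafka-cluster IAM namespace as I understand it.

```python
import json


def producer_policy(region, account_id, cluster_name, cluster_uuid, topic):
    """Build an IAM policy allowing a client to connect and write to one topic."""
    base = f"arn:aws:kafka:{region}:{account_id}"
    cluster_arn = f"{base}:cluster/{cluster_name}/{cluster_uuid}"
    topic_arn = f"{base}:topic/{cluster_name}/{cluster_uuid}/{topic}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kafka-cluster:Connect"],
                "Resource": cluster_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["kafka-cluster:DescribeTopic", "kafka-cluster:WriteData"],
                "Resource": topic_arn,
            },
        ],
    }


# Placeholder identifiers for illustration only.
policy = producer_policy("us-east-1", "111122223333", "demo-cluster", "example-uuid", "orders")
print(json.dumps(policy, indent=2))
```

A consumer would need a similar statement with read and consumer-group actions instead of the write action.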
Amazon Managed Streaming Monitoring
- You can collect metrics, monitor, and analyze clusters using Amazon CloudWatch.
- To monitor consumer lag and identify slow or stuck consumers, use CloudWatch or open monitoring with Prometheus.
- You can deliver the Apache Kafka broker logs to the following destination types:
- Amazon CloudWatch Logs
- Amazon S3
- Amazon Data Firehose
- MSK Connect continuously does the following:
- Monitors connector health and delivery state.
- Patches and manages the underlying hardware.
- Autoscales connectors to match changes in throughput.
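To make the consumer lag point above concrete, this sketch shows what the lag figure measures: the gap between each partition's latest offset and the offset the consumer group has committed. It works on plain dictionaries with made-up offsets rather than calling CloudWatch or Prometheus, and the "stuck" rule (lag present but no progress between two samples) is a simplifying assumption.

```python
def partition_lag(end_offsets, committed_offsets):
    """Lag per partition: log end offset minus committed offset (never negative)."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is treated as offset 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


def stuck_partitions(previous_committed, current_committed, current_lag):
    """Partitions that have lag but whose committed offset did not advance."""
    return sorted(
        p for p, lag in current_lag.items()
        if lag > 0 and current_committed.get(p, 0) <= previous_committed.get(p, 0)
    )


# Made-up sample data: partition -> offset.
end = {0: 120, 1: 300, 2: 50}
previous = {0: 100, 1: 250, 2: 50}
current = {0: 120, 1: 250, 2: 50}

lag = partition_lag(end, current)
print(lag)                                        # {0: 0, 1: 50, 2: 0}
print(stuck_partitions(previous, current, lag))   # [1]
```

In this sample, partition 1 is flagged because it is 50 messages behind and its committed offset did not move between the two samples.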
Amazon Managed Streaming Pricing
- You are charged for the following:
- Every Apache Kafka broker instance.
- The amount of storage you provision in your cluster.
- MSK Serverless charges you for cluster, partition, and storage.
- For MSK Connect, you are charged for the number and size (MCUs) of each Kafka Connect worker.
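The pricing dimensions above can be combined into a back-of-the-envelope estimate. This is only a sketch: the rates are parameters you must look up for your Region and broker type, the numbers in the example are placeholders and not real AWS prices, and charges not listed above (such as data transfer) are ignored.

```python
def provisioned_cluster_cost(brokers, hours, broker_hourly_rate,
                             storage_gb, storage_gb_month_rate):
    """Broker instance hours plus provisioned storage for one full month."""
    return brokers * hours * broker_hourly_rate + storage_gb * storage_gb_month_rate


def connect_cost(workers, mcus_per_worker, hours, mcu_hourly_rate):
    """MSK Connect cost: number of workers times their size in MCUs times hours."""
    return workers * mcus_per_worker * hours * mcu_hourly_rate


# Placeholder rates for illustration; substitute the published prices.
cluster = provisioned_cluster_cost(
    brokers=3, hours=720, broker_hourly_rate=0.5,
    storage_gb=1000, storage_gb_month_rate=0.25,
)
connector = connect_cost(workers=2, mcus_per_worker=1, hours=720, mcu_hourly_rate=0.5)
print(f"cluster: {cluster:.2f}, connector: {connector:.2f}")
```

Keeping the rates as explicit arguments makes it easy to rerun the estimate for a different Region or broker size.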
Amazon MSK Cheat Sheet References:
https://aws.amazon.com/msk/
https://docs.aws.amazon.com/msk/latest/developerguide/what-is-msk.html











