Last updated on November 14, 2024
Amazon Elastic Inference Cheat Sheet
- Allows you to attach low-cost GPU-powered inference acceleration to EC2 instances, SageMaker instances, or ECS tasks (see the launch sketch below).
- Reduces machine learning inference costs by up to 75%.
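A minimal sketch of attaching an accelerator at EC2 launch time with boto3. The AMI, subnet, and key pair IDs are placeholders for your own resources, and eia2.medium is just one illustrative accelerator size:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch an instance with an Elastic Inference accelerator attached.
# ImageId, SubnetId, and KeyName are placeholders; substitute your own.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # e.g., an AWS Deep Learning AMI
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",
    KeyName="my-key-pair",
    # The accelerator is provisioned separately and attached over the network.
    ElasticInferenceAccelerators=[{"Type": "eia2.medium", "Count": 1}],
)
print(response["Instances"][0]["InstanceId"])
```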
Common Use Cases
- Computer vision
- Natural language processing
- Speech recognition
Concepts
- Accelerator
- A GPU-powered hardware device that is provisioned when you launch an instance with acceleration enabled.
- It is not part of the hardware that hosts your instance.
- Instead, it attaches to the instance over the network through an AWS PrivateLink endpoint service.
- Only a single endpoint service is required in each Availability Zone to connect accelerators to instances (a minimal endpoint sketch follows this list).
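A minimal sketch of creating that PrivateLink interface endpoint with boto3, assuming the us-east-1 region; the VPC, subnet, and security group IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create one interface VPC endpoint for the Elastic Inference runtime.
# Instances in the VPC use this endpoint to reach their accelerators.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",  # placeholder
    ServiceName="com.amazonaws.us-east-1.elastic-inference.runtime",
    SubnetIds=["subnet-0123456789abcdef0"],  # one subnet per AZ
    SecurityGroupIds=["sg-0123456789abcdef0"],  # must allow HTTPS (443)
    PrivateDnsEnabled=True,
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```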
Features
- Supports TensorFlow, Apache MXNet, PyTorch, and ONNX models (see the SageMaker deployment sketch after this list).
- Can provide 1 to 32 trillion floating-point operations per second (TFLOPS) per accelerator.
- In an Auto Scaling group, an accelerator is attached to each new instance as the group scales out, so inference acceleration scales with your application’s compute demand.
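A minimal sketch of attaching an accelerator to a SageMaker endpoint using the SageMaker Python SDK. The S3 model path, IAM role ARN, and framework version are placeholders/assumptions; the accelerator rides alongside a cheap CPU instance instead of a full GPU instance:

```python
from sagemaker.tensorflow import TensorFlowModel

# Placeholder model artifact and IAM role; substitute your own.
model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    framework_version="2.3",  # assumed EI-compatible TensorFlow version
)

# accelerator_type attaches an Elastic Inference accelerator
# to the CPU-based endpoint instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    accelerator_type="ml.eia2.medium",
)
```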
Amazon Elastic Inference Pricing
- You are charged only for the accelerator hours you consume (a back-of-the-envelope example follows below).
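A quick cost sketch; the hourly rate below is purely hypothetical, since actual rates vary by accelerator size and region (see the pricing page in the references):

```python
# Hypothetical rate; check the Elastic Inference pricing page for real numbers.
hourly_rate_usd = 0.12  # assumed cost per accelerator-hour
hours_per_day = 24
days = 30

monthly_cost = hourly_rate_usd * hours_per_day * days
print(f"Estimated monthly cost per accelerator: ${monthly_cost:.2f}")  # $86.40
```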
Note: If you are studying for the AWS Certified Machine Learning Specialty exam, we highly recommend that you take our AWS Certified Machine Learning – Specialty Practice Exams and read our Machine Learning Specialty exam study guide.
Amazon Elastic Inference Cheat Sheet References:
https://aws.amazon.com/machine-learning/elastic-inference/features/
https://docs.aws.amazon.com/elastic-inference/latest/developerguide/basics.html
https://aws.amazon.com/machine-learning/elastic-inference/pricing/