Amazon Elastic Inference
Carlo Acebedo, 2024-11-14

Amazon Elastic Inference Cheat Sheet

- Allows attaching low-cost GPU-powered inference acceleration to EC2 instances, SageMaker instances, or ECS tasks.
- Reduces machine learning inference costs by up to 75%.

Common Use Cases
- Computer vision
- Natural language processing
- Speech recognition

Concepts
Accelerator
- A provisioned GPU-powered hardware device. It is not part of the hardware where your instance is hosted.
- Uses an AWS PrivateLink endpoint service to attach to the instance over the network.
- Only a single endpoint service is required in each Availability Zone to connect Elastic Inference accelerators to instances.

Features
- Supports TensorFlow, Apache MXNet, PyTorch, and ONNX models.
- Can provide [...]