Discover community-driven integrations, libraries, and resources to extend Datadog across your stack. Filter by platform, data type, use case, and more.
Monitor OpenStack v13+ environments from the controller node, collecting resource usage metrics for hypervisors, virtual machines, and networking (Neutron) components.
The Datadog TorchServe integration monitors your TorchServe instances by collecting metrics, events, and logs from the Inference API, Management API, and OpenMetrics endpoints. Track overall health, model performance, and custom metrics, and receive alerts on key events such as model additions or removals. The integration can be configured for hosts, Docker, and Kubernetes environments, helping you detect issues in your TorchServe deployments quickly.
SIOS AppKeeper automatically restarts failed services on Amazon EC2 in response to Datadog alerts, reducing downtime and the need for manual recovery.