Explore Integrations and Libraries

Discover and build community-driven integrations and libraries to extend Datadog across your stack. Filter by platform, data type, use case, and more.


agent-integration

NVIDIA NIM

The NVIDIA NIM integration with Datadog provides real-time GPU observability by collecting Prometheus metrics.

  • Datadog help@datadoghq.com

    https://www.datadoghq.com

agent-integration

NVIDIA Triton

Monitor NVIDIA Triton Inference Server performance and health metrics using the Datadog Agent, enabling visibility into inference workloads, resource utilization, and server status.

  • Datadog help@datadoghq.com

    https://www.datadoghq.com

agent-integration

Ray

Monitor Ray, an open-source unified compute framework for scaling AI and Python workloads, to track health, performance, and resource utilization.

  • Datadog help@datadoghq.com

    https://www.datadoghq.com

agent-integration

TorchServe

The Datadog TorchServe integration monitors your TorchServe instances by collecting metrics, events, and logs from the Inference API, Management API, and OpenMetrics endpoints. Track overall health status, model performance, and custom metrics, and receive alerts on key events such as model additions or removals. The integration can be configured for hosts, Docker, and Kubernetes environments, so you can confirm that TorchServe deployments are healthy and detect issues quickly.

  • Datadog help@datadoghq.com

    https://www.datadoghq.com
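
Several of the integrations above collect metrics from Prometheus or OpenMetrics endpoints. As a rough illustration of what that text format looks like to a collector, here is a minimal Python sketch that parses a few sample lines. It is not the Datadog Agent's implementation; the function name parse_metrics and the metric names in EXAMPLE are hypothetical, and the parser handles only simple samples (it ignores comments and trailing timestamps and leaves label escapes as-is).

```python
import re

# Matches simple sample lines such as: metric_name{label="value"} 1.5
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# Matches individual label pairs such as: label="value"
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a sample line this sketch understands.
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; real endpoints expose their own metric names.
EXAMPLE = """\
# HELP example_requests_total Hypothetical request counter.
# TYPE example_requests_total counter
example_requests_total{model="demo",status="ok"} 42
example_queue_latency_seconds 0.25
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(EXAMPLE):
        print(name, labels, value)
```

In practice the Datadog Agent handles scraping and parsing for you; the sketch only shows the shape of the data these integrations consume.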