Sysdig Monitor AWS Metrics

This page lists the metrics available for supported AWS services in your infrastructure, along with a brief description of each. To display them, go to the Explore page and make a selection from the AWS Services list. Clicking any instance within the chosen service opens a detailed drill-down view for that instance. Choose a metric from the Metrics column on the left of the drill-down view. Metrics are added regularly, and this list will be updated over time.

Categories

AWS Metrics are grouped in the following categories:

Name                          Description
Elastic Compute Cloud         Metrics coming from Amazon Elastic Compute Cloud
Elastic Load Balancing        Metrics coming from Amazon Elastic Load Balancing
ElastiCache                   Metrics coming from Amazon ElastiCache
Relational Database Service   Metrics coming from Amazon Relational Database Service
Simple Queue Service          Metrics coming from Amazon Simple Queue Service
DynamoDB                      NoSQL database service Metrics coming from Amazon DynamoDB

 

Elastic Compute Cloud

Metrics coming from Amazon Elastic Compute Cloud

CPU %

aws.ec2.CPUUtilization

The percentage of allocated EC2 compute units that are currently in use on the instance.
Usage: This metric identifies the processing power required to run an application on a selected instance.

Disk Read

aws.ec2.DiskReadBytes

Bytes read from all ephemeral disks available to the instance.
Usage: This metric shows the volume of data the application reads from the instance's disks and can help gauge the speed of the application.

Disk Read Operations

aws.ec2.DiskReadOps

Completed read operations from all ephemeral disks available to the instance in a specified period of time.

Disk Write

aws.ec2.DiskWriteBytes

Bytes written to all ephemeral disks available to the instance.
Usage: This metric shows the volume of data the application writes to the instance's disks and can help gauge the speed of the application.

Disk Write Operations

aws.ec2.DiskWriteOps

Completed write operations to all ephemeral disks available to the instance in a specified period of time. If your instance uses Amazon EBS volumes, see Amazon EBS Metrics.
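The disk metrics above are described as values for a specified period of time, so comparing them across periods of different length is easier once they are converted to a per-second rate. The following is a minimal sketch, assuming the reported value is a total accumulated over the period; the function name and the sample figures are hypothetical and not part of any Sysdig or AWS API.

```python
def per_second_rate(period_total: float, period_seconds: int) -> float:
    """Convert a total accumulated over one reporting period into a per-second rate."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return period_total / period_seconds


# Hypothetical sample: 1,500,000 bytes read during a 300-second period.
read_bytes_per_second = per_second_rate(1_500_000, 300)
print(read_bytes_per_second)  # 5000.0
```

The same conversion applies to the operation counts (aws.ec2.DiskReadOps, aws.ec2.DiskWriteOps) if they are also reported as per-period totals.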

Network In

aws.ec2.NetworkIn

The number of bytes received on all network interfaces by the instance.

Network Out

aws.ec2.NetworkOut

The number of bytes sent out on all network interfaces by the instance.
Usage: This metric identifies the volume of outgoing network traffic from an application on a single instance.

Elastic Load Balancing

Metrics coming from Amazon Elastic Load Balancing

Backend Connection Errors

aws.elb.BackendConnectionErrors

The number of errors encountered by the load balancer while attempting to connect to your application.
Usage: A high error count means the ELB is having problems connecting to your servers. Look for network-related issues or check that the servers are operating correctly.

Healthy Hosts

aws.elb.HealthyHostCount

A count of the number of healthy instances that are bound to the load balancer.
Usage: Hosts are declared healthy if they meet the threshold for the number of consecutive health checks that are successful. Hosts that have failed more health checks than the value of the unhealthy threshold are considered unhealthy. If cross-zone is enabled, the count of the number of healthy instances is calculated for all Availability Zones.
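The threshold rule described above can be sketched as a small state machine over a sequence of health check results. This is an illustrative model only: the function name and parameters are hypothetical, and it assumes a host changes state once the number of consecutive successes or failures reaches the corresponding threshold, which may differ in detail from how a given load balancer counts.

```python
from typing import Iterable


def classify_host(
    check_results: Iterable[bool],
    healthy_threshold: int,
    unhealthy_threshold: int,
    initially_healthy: bool = False,
) -> bool:
    """Return True if the host ends up healthy after the given check results.

    Assumption: the state flips when the run of consecutive passes (or
    failures) reaches the healthy (or unhealthy) threshold.
    """
    healthy = initially_healthy
    successes = 0
    failures = 0
    for passed in check_results:
        if passed:
            successes += 1
            failures = 0  # a pass breaks any run of failures
            if successes >= healthy_threshold:
                healthy = True
        else:
            failures += 1
            successes = 0  # a failure breaks any run of passes
            if failures >= unhealthy_threshold:
                healthy = False
    return healthy


# Three consecutive passes make the host healthy; a single failure
# afterwards is not enough to mark it unhealthy when the threshold is 2.
print(classify_host([True, True, True, False, True], 3, 2))  # True
```

Counting the hosts for which this returns True or False mirrors what the Healthy Hosts and Unhealthy Hosts metrics report.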

Backend HTTP 2XX

aws.elb.HTTPCode_Backend_2XX

The count of the number of HTTP 2XX response codes generated by back-end instances. This metric does not include any response codes generated by the load balancer.
Usage: The 2XX class status codes represent successful actions (e.g., 200-OK, 201-Created, 202-Accepted, 203-Non-Authoritative Info).

Backend HTTP 3XX

aws.elb.HTTPCode_Backend_3XX

The count of the number of HTTP 3XX response codes generated by back-end instances. This metric does not include any response codes generated by the load balancer.
Usage: The 3XX class status codes indicate that the user agent must take further action to complete the request (e.g., 301-Moved Permanently, 302-Found, 305-Use Proxy, 307-Temporary Redirect).

Backend HTTP 4XX

aws.elb.HTTPCode_Backend_4XX

The count of the number of HTTP 4XX response codes generated by back-end instances. This metric does not include any response codes generated by the load balancer.
Usage: The 4XX class status codes represent client errors (e.g., 400-Bad Request, 401-Unauthorized, 403-Forbidden, 404-Not Found).

Backend HTTP 5XX

aws.elb.HTTPCode_Backend_5XX

The count of the number of HTTP 5XX response codes generated by back-end instances. This metric does not include any response codes generated by the load balancer.
Usage: The 5XX class status codes represent back-end server errors (e.g., 500-Internal Server Error, 501-Not Implemented, 503-Service Unavailable).
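The four back-end counters are often easier to read as shares of all back-end responses than as raw counts. The sketch below shows one way to compute those shares; the function name and the sample counts are hypothetical, and the inputs are assumed to be counts collected over the same time window.

```python
from typing import Dict


def backend_status_shares(
    count_2xx: int, count_3xx: int, count_4xx: int, count_5xx: int
) -> Dict[str, float]:
    """Return each status class as a fraction of all back-end responses."""
    counts = {
        "2XX": count_2xx,
        "3XX": count_3xx,
        "4XX": count_4xx,
        "5XX": count_5xx,
    }
    total = sum(counts.values())
    if total == 0:
        # No responses in the window: report zero shares instead of dividing by zero.
        return {status: 0.0 for status in counts}
    return {status: value / total for status, value in counts.items()}


# Hypothetical window with 1,000 back-end responses.
shares = backend_status_shares(940, 20, 30, 10)
print(f"5XX share: {shares['5XX']:.1%}")  # 5XX share: 1.0%
```

These shares cover only responses generated by back-end instances, matching the metric definitions above.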

HTTP 4XX

aws.elb.HTTPCode_ELB_4XX

The count of the number of HTTP 4XX client error codes generated by the load balancer when the listener is configured to use HTTP or HTTPS protocols.
Usage: Client errors are generated when a request is malformed or is incomplete.

HTTP 5XX

aws.elb.HTTPCode_ELB_5XX

The count of the number of HTTP 5XX server error codes generated by the load balancer when the listener is configured to use HTTP or HTTPS protocols. This metric does not include any responses generated by back-end instances.
Usage: The metric is reported if there are no back-end instances that are healthy or registered to the load balancer, or if the request rate exceeds the capacity of the instances or the load balancers.

Request Latency

aws.elb.Latency

A measurement of the time backend requests require to process.
Usage: Latency metrics from the ELB are good indicators of the overall performance of your application.

Requests

aws.elb.RequestCount

The number of requests handled by the load balancer.

Request Spillover

aws.elb.SpilloverCount

A count of the total number of requests that were rejected due to the queue being full.
Usage: Positive numbers indicate some requests are not being forwarded to any server. Clients are not notified that their request was dropped.

Request Surge Queue

aws.elb.SurgeQueueLength

A count of the total number of requests that are pending submission to a registered instance.
Usage: Positive numbers indicate clients are waiting for their requests to be forwarded to a server for processing.
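Read together, the spillover and surge queue metrics describe three situations: nothing is waiting, requests are waiting, or requests are being rejected. The following sketch encodes that reading; the function name and the returned labels are hypothetical and chosen only for illustration.

```python
def describe_queue_state(surge_queue_length: int, spillover_count: int) -> str:
    """Summarize load balancer queue pressure from two metric samples."""
    if spillover_count > 0:
        # The queue was full and some requests were rejected.
        return "dropping"
    if surge_queue_length > 0:
        # Requests are waiting to be forwarded to a registered instance.
        return "queuing"
    return "ok"


print(describe_queue_state(surge_queue_length=5, spillover_count=0))  # queuing
```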

Unhealthy Hosts

aws.elb.UnHealthyHostCount

The count of the number of unhealthy instances that are bound to the load balancer.
Usage: Hosts are declared healthy if they meet the threshold for the number of consecutive health checks that are successful. Hosts that have failed more health checks than the value of the unhealthy threshold are considered unhealthy.
 

ElastiCache

Metrics coming from Amazon ElastiCache

CPU Utilization

aws.elasticache.CPUUtilization

The percentage of CPU utilization.
Tip: When CPU utilization is high and your main workload is from read requests, scale your cache cluster out by adding read replicas. If the main workload is from write requests, scale up by using a larger cache instance type.
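The tip above amounts to a simple decision rule, sketched here. The function name, the request-count inputs, and the 90% threshold are all assumptions made for illustration; the source does not define what counts as high utilization.

```python
def suggest_scaling(
    cpu_percent: float,
    read_requests: int,
    write_requests: int,
    cpu_threshold: float = 90.0,  # hypothetical cut-off for "high" utilization
) -> str:
    """Suggest a scaling direction for a cache cluster based on its workload mix."""
    if cpu_percent < cpu_threshold:
        return "no action"
    if read_requests >= write_requests:
        # Read-heavy workload: spread reads across more nodes.
        return "scale out: add read replicas"
    # Write-heavy workload: give the node more capacity.
    return "scale up: use a larger cache instance type"


print(suggest_scaling(95.0, read_requests=8000, write_requests=500))
# scale out: add read replicas
```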

Freeable Memory

aws.elasticache.FreeableMemory

Memory considered free or capable of being made available to be used by the node.

Network Bytes In

aws.elasticache.NetworkBytesIn

The number of bytes the host has read from the network.

Network Bytes Out

aws.elasticache.NetworkBytesOut

The number of bytes the host has written to the network.

Swap Usage

aws.elasticache.SwapUsage

The amount of swap space used on the host.
Tip: If swap is being utilized, the node probably needs more memory than is available and cache performance may be negatively impacted. Consider adding more nodes or using larger ones to reduce or eliminate swapping.
 

Relational Database Service

Metrics coming from Amazon Relational Database Service

DB Bin Log

aws.rds.BinLogDiskUsage

The amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.

CPU %

aws.rds.CPUUtilization

The percentage of CPU utilization.

DB Connections

aws.rds.DatabaseConnections

The number of database connections in use.

Disk Queue Depth

aws.rds.DiskQueueDepth

The number of outstanding IOs (read/write requests) waiting to access the disk.

Memory Free %

aws.rds.FreeableMemory

The amount of available random access memory.

Storage Free

aws.rds.FreeStorageSpace

The amount of available storage space in bytes.

Network In

aws.rds.NetworkReceiveThroughput

The incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

Network Out

aws.rds.NetworkTransmitThroughput

The outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

Disk Read IOPS

aws.rds.ReadIOPS

The average number of disk read I/O operations per second.

Disk Read Latency

aws.rds.ReadLatency

The average amount of time, in seconds, taken per disk read I/O operation.

Disk Read Throughput

aws.rds.ReadThroughput

The average number of bytes read from disk per second.

DB Replica Lag

aws.rds.ReplicaLag

The amount of time a Read Replica DB Instance lags behind the source DB Instance. Applies to MySQL read replicas. The ReplicaLag metric reports the value of the Seconds_Behind_Master field of the MySQL SHOW SLAVE STATUS command.

Swap Usage

aws.rds.SwapUsage

The amount of swap space used on the DB instance.

Disk Write IOPS

aws.rds.WriteIOPS

The average number of disk write I/O operations per second.

Disk Write Latency

aws.rds.WriteLatency

The average amount of time, in seconds, taken per disk write I/O operation.

Disk Write Throughput

aws.rds.WriteThroughput

The average number of bytes written to disk per second.
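Because throughput is measured in bytes per second and IOPS in operations per second, dividing one by the other gives the average size of each I/O operation. The sketch below illustrates that arithmetic for either the read or the write pair of metrics; the function name and sample values are hypothetical.

```python
def average_io_size(throughput_bytes_per_sec: float, iops: float) -> float:
    """Return the average number of bytes transferred per I/O operation."""
    if iops == 0:
        # No operations in the interval, so there is no meaningful average.
        return 0.0
    return throughput_bytes_per_sec / iops


# Hypothetical sample: 4 MiB/s of write throughput at 256 write IOPS.
print(average_io_size(4_194_304, 256))  # 16384.0
```

A small average size with high IOPS suggests many small operations, while a large average size suggests fewer, bulkier transfers.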