Monitoring ElastiCache redis is broken #571

@michaelwittig

Monitoring ElastiCache for Redis is currently not possible in an elegant way in CloudFormation. The CacheClusterId CloudWatch metric dimension identifies individual nodes rather than the replication group, and its value depends on the number of shards and replicas, which prevents us from creating alarms for the relevant metrics.

  • For cluster mode disabled replication groups (NumShards = 1), the following CacheClusterIds are used: ${ReplicationGroup}-NNN (NNN e.g., 001, 002, ..., 006, one per node)
  • For cluster mode enabled replication groups (NumShards > 1), the following CacheClusterIds are used: ${ReplicationGroup}-MMMM-NNN (MMMM e.g., 0001, 0002, ..., one per node group/shard id)
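To make the naming scheme above concrete, here is a minimal Python sketch that lists the CacheClusterId values a replication group would produce. The function name `cache_cluster_ids` and its parameters are hypothetical, and it assumes node and shard numbers start at 1 and are contiguous, with one primary plus the given number of replicas per shard.

```python
def cache_cluster_ids(replication_group: str, num_shards: int, replicas_per_shard: int) -> list[str]:
    """List the CacheClusterId dimension values for a replication group.

    Assumption: numbering starts at 1 and is contiguous; each shard has
    one primary plus `replicas_per_shard` replicas.
    """
    nodes = range(1, replicas_per_shard + 2)  # primary + replicas
    if num_shards == 1:
        # Cluster mode disabled: ${ReplicationGroup}-NNN
        return [f"{replication_group}-{node:03d}" for node in nodes]
    # Cluster mode enabled: ${ReplicationGroup}-MMMM-NNN
    return [
        f"{replication_group}-{shard:04d}-{node:03d}"
        for shard in range(1, num_shards + 1)
        for node in nodes
    ]


print(cache_cluster_ids("redis", 1, 2))
print(cache_cluster_ids("redis", 2, 1))
```

Each of these IDs is a separate value of the CacheClusterId dimension, so each one needs its own alarm per metric.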

The following alarm would solve the issue. Unfortunately, SEARCH expressions are not supported in alarms yet:

CPUUtilizationTooHighAlarm:
   Type: 'AWS::CloudWatch::Alarm'
   Properties:
     AlarmDescription: !Sub 'Average CPU utilization over last 10 minutes higher than ${CPUUtilizationThreshold}%'
     ComparisonOperator: GreaterThanThreshold
     EvaluationPeriods: 1
     Metrics:
     - Expression: !Sub 'SEARCH(''{AWS/ElastiCache, CacheClusterId} ${ReplicationGroup} "CPUUtilization"'', ''Average'', 600)'
       Id: 'e1'
       Label: 'e1'
       ReturnData: false
     - Expression: 'MAX(e1)'
       Id: 'e2'
       Label: 'e2'
       ReturnData: true
     Threshold: !Ref CPUUtilizationThreshold
     AlarmActions:
     - 'Fn::ImportValue': !Sub '${ParentAlertStack}-TopicARN'

We don't have loops in CloudFormation either.

Since we allow up to 250 shards with up to 5 replicas each, we would need one conditional alarm per possible node, which would require too many conditions and bloat the template massively.
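To illustrate the scale, the following Python sketch generates one static CPUUtilization alarm per node outside of CloudFormation and counts the result. It is only an illustration under assumptions: `static_alarms`, the logical ID pattern, and the default threshold of 80 are hypothetical, and numbering is assumed to start at 1 with one primary plus the replicas in each shard.

```python
def static_alarms(replication_group: str, num_shards: int, replicas_per_shard: int,
                  threshold: int = 80) -> dict:
    """Build one AWS::CloudWatch::Alarm resource per node (hypothetical helper)."""
    resources = {}
    for shard in range(1, num_shards + 1):
        for node in range(1, replicas_per_shard + 2):  # primary + replicas
            if num_shards == 1:
                cluster_id = f"{replication_group}-{node:03d}"
            else:
                cluster_id = f"{replication_group}-{shard:04d}-{node:03d}"
            # Logical IDs must be alphanumeric, so the numbers are appended without dashes.
            logical_id = f"CPUUtilizationTooHighAlarm{shard:04d}{node:03d}"
            resources[logical_id] = {
                "Type": "AWS::CloudWatch::Alarm",
                "Properties": {
                    "Namespace": "AWS/ElastiCache",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],
                    "Statistic": "Average",
                    "Period": 600,
                    "EvaluationPeriods": 1,
                    "ComparisonOperator": "GreaterThanThreshold",
                    "Threshold": threshold,
                },
            }
    return resources


# Worst case from the text: 250 shards with 5 replicas each.
print(len(static_alarms("redis", 250, 5)))  # 1500
```

That is 1500 alarm resources for a single metric, and the count multiplies with every additional metric that should be monitored.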

Not sure what we can do here...
