
Network Canary Down

Anandkumar Patel edited this page Apr 25, 2016 · 11 revisions

Network canary investigation

  1. Look in Loggly for the "failed to ping a container" message: https://sandboxes.loggly.com/search#terms=%22failed%20to%20ping%20a%20container%22%20%22production-delta%22%20&from=2016-04-25T20%3A51%3A01.715Z&until=2016-04-25T21%3A01%3A01.715Z&source_group=
     The results show the IPs that could not be pinged.

  2. In Mongo, get the list of containers we are supposed to ping:

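// ORG_ID is a placeholder: substitute the owner's GitHub ID before running.
db.instances.find({
  'container.inspect.State.Running': true,  // only running containers
  'owner.github': ORG_ID
}, {
  // projection: return only the fields needed to locate each container
  'network.hostIp': 1,
  'name': 1,
  'container.dockerHost': 1
})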

Compare the failed IPs from Loggly against this list to ensure we did not ping something we are not supposed to.
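The comparison between the failed IPs and the expected list can be sketched in shell. This is a minimal sketch, not part of the original procedure: it assumes the failed IPs from Loggly and the `network.hostIp` values from the Mongo query have each been saved by hand to a file with one IP per line. The function name `unexpected_ips` and both file arguments are hypothetical.

```shell
# Print IPs that failed the ping but are NOT in the expected list.
# $1: hypothetical file of failed IPs copied from Loggly, one per line
# $2: hypothetical file of expected IPs (network.hostIp) from Mongo, one per line
unexpected_ips() {
  failed_file="$1"
  expected_file="$2"
  # -F: fixed strings, -x: match whole lines only (so 10.0.0.1 does not match 10.0.0.10),
  # -v: keep lines that match none of the patterns, -f: read patterns from file
  grep -Fxv -f "$expected_file" "$failed_file" | sort -u
}
```

For example, `unexpected_ips failed.txt expected.txt` prints nothing when every failed IP belongs to a container we are supposed to ping. Any IP it does print is one we should not have been pinging.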

  1. SSH into those docker hosts. If a host does not exist, mark the dock as unhealthy ...

  2. Check weave status:

weave status
weave status ipam
weave status peers
weave status connections
  1. Look at the weave logs for errors:
docker logs weave 2>&1 | grep -q -m1 "no such device" && echo BAD || echo OK

If you see "no such device" (the command above prints BAD), weave is hosed.

  1. Kill the weave container:
docker kill weave
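When several docks need checking, the log check above can be wrapped in a small function that reads logs on stdin, so the same test works locally or over SSH. This is only a sketch: the function name `weave_log_verdict` is hypothetical, and it applies the same `grep` test as the one-liner above.

```shell
# Read weave logs on stdin; print BAD if "no such device" appears, otherwise OK.
weave_log_verdict() {
  # -q: print nothing, -m1: stop reading at the first match
  if grep -q -m1 "no such device"; then
    echo BAD
  else
    echo OK
  fi
}
```

Locally this would be used as `docker logs weave 2>&1 | weave_log_verdict`. From another machine, assuming SSH access to the dock, it would be `ssh "$host" 'docker logs weave 2>&1' | weave_log_verdict`, where `$host` is a placeholder for one of the `container.dockerHost` values.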
