Network services built from distributed components are attracting increasing attention because they can provide advanced services at lower cost by assigning and reusing components that run on remote nodes. In such a scheme, if a network service component fails, a component on another node can be substituted to perform the same function.
However, components are not the only points of failure in a network service. There are three kinds of failures in network services: software component failures, hardware node failures, and networking equipment hardware failures. Because these failures occur in different layers, it is generally difficult for a single failure detection method to identify them in a unified way.
We propose a new method for identifying failures in network services. It detects failures at the application level as well as those occurring in the network layer by collecting a small number of messages through cooperation among multiple overlay networks.
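To make the idea of distinguishing the three failure kinds concrete, the following is a minimal illustrative sketch, not the proposed method itself. It assumes a hypothetical setup in which one application-level probe is sent to a component and one node-level probe is sent to its host through each of several overlay networks; the names `Failure` and `classify`, and the decision rules, are assumptions made only for this illustration.

```python
from enum import Enum


class Failure(Enum):
    NONE = "no failure"
    COMPONENT = "software component failure"
    NODE = "hardware node failure"
    NETWORK = "networking equipment failure"
    UNKNOWN = "unknown"


def classify(component_ok, node_reachable_by_overlay):
    """Classify a failure from a small number of probe results.

    component_ok: True if the component answered an application-level probe.
    node_reachable_by_overlay: dict mapping an overlay name to True/False,
        indicating whether a node-level probe sent through that overlay
        reached the component's host (hypothetical input format).
    """
    if component_ok:
        return Failure.NONE
    reach = list(node_reachable_by_overlay.values())
    if not reach:
        # No node-level evidence was collected, so nothing can be concluded.
        return Failure.UNKNOWN
    if all(reach):
        # The host answers on every overlay, so only the component is down.
        return Failure.COMPONENT
    if any(reach):
        # The host answers on some paths only: assume a network fault.
        return Failure.NETWORK
    # No overlay can reach the host: assumed here to be a node failure,
    # although a fault shared by all paths would look the same.
    return Failure.NODE
```

For example, `classify(False, {"overlay_a": True, "overlay_b": False})` returns `Failure.NETWORK` under these assumed rules, because the host is still reachable through one overlay. The point of the sketch is only that combining views from several overlays lets a few messages separate application-level faults from lower-layer ones.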