How To Create Health Check For RESTful Microservice In Golang

Imagine you’ve just deployed to production a cool RESTful microservice you’ve worked on for a while. You heave a sigh of relief, only to hear from the Ops team that your service is unstable. You are damn sure that the service itself is fine, but you get a feeling that something could be wrong with the services it depends on. What should you do?

A health check will come to your rescue. It is an endpoint in your service that returns the status of your application, including the status of connections to all external services it directly depends on. In this post I’ll show how to create a health check for a microservice that runs on multiple nodes, stores its state in MongoDB and calls Elasticsearch.

If you raised an eyebrow, wondering why your service should monitor external services, you are right: external services must be monitored independently. In practice, however, some of those checks may be temporarily down, and nothing is more permanent than the temporary. So it’s good practice to include your direct dependencies in the service status, so you (and Ops) always know what’s broken.

Design

As I alluded to earlier, imagine you have a microservice running on multiple nodes, keeping state in MongoDB and calling Elasticsearch. What should a health check look like for such a service?

Let’s look at the question from several angles.

Endpoint

An easy one. Let’s follow the common naming convention and call the endpoint /health.

Format

For a RESTful service, the endpoint should always return HTTP status code 200 and carry the state in the response body as JSON.

Content

This is the interesting one. The response content must reflect the health of all critical parts of the service. In our case these are the nodes, the connection to MongoDB and the connection to Elasticsearch. Represented as a Go struct, the health status may look like this:

type HealthStatus struct {
	Nodes   map[string]string `json:"nodes"`
	Mongo   string            `json:"mongo"`
	Elastic string            `json:"elastic"`
}
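
To see what the struct tags do, here is a minimal sketch that serializes a HealthStatus value to compact JSON. The encode helper and the sample values are assumptions made for illustration; they are not part of the example repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HealthStatus mirrors the struct described above.
type HealthStatus struct {
	Nodes   map[string]string `json:"nodes"`
	Mongo   string            `json:"mongo"`
	Elastic string            `json:"elastic"`
}

// encode is a hypothetical helper that renders a status as compact JSON.
func encode(s HealthStatus) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(HealthStatus{
		Nodes:   map[string]string{"node1": "OK"},
		Mongo:   "OK",
		Elastic: "Elastic ERROR: Service unavailable",
	})
	if err != nil {
		panic(err)
	}
	// Prints: {"nodes":{"node1":"OK"},"mongo":"OK","elastic":"Elastic ERROR: Service unavailable"}
	fmt.Println(out)
}
```

The lower-case keys come from the json tags; without them the fields would be emitted as Nodes, Mongo and Elastic.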

Implementation

The clearest way to show how a health check fits into a microservice is to show it together with the modules it collaborates with. The skeleton of my example has the following modules:

main
mongo
elastic
health

main module

The main module just sets up the service:

package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/ypitsishin/code-with-yury-examples/healthcheck/elastic"
	"github.com/ypitsishin/code-with-yury-examples/healthcheck/health"
	"github.com/ypitsishin/code-with-yury-examples/healthcheck/mongo"
)

func main() {
	// Wire the health service to the node list and both dependencies.
	healthService := health.New([]string{"node1", "node2", "node3"}, mongo.New(), elastic.New())
	http.HandleFunc("/health", statusHandler(healthService))
	// ListenAndServe only returns on failure, so log the error and exit.
	log.Fatal(http.ListenAndServe("localhost:8080", nil))
}

// statusHandler returns a handler that writes the current health status as indented JSON.
func statusHandler(healthService health.Service) func(http.ResponseWriter, *http.Request) {
	return func(w http.ResponseWriter, r *http.Request) {
		bytes, err := json.MarshalIndent(healthService.Health(), "", "\t")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		// Tell clients and monitoring tools that the body is JSON.
		w.Header().Set("Content-Type", "application/json")
		w.Write(bytes)
	}
}

Note that the health service needs access to both the mongo and elastic modules.
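
Because the handler only depends on something with a Health() method, it can be exercised in-process without starting a server. The following is a sketch under a few assumptions: the healthReporter interface, the fixedHealth stub and the probe helper are names invented here, HealthStatus is redeclared locally so the file is self-contained, and the handler sets a Content-Type header as a small addition of mine.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// HealthStatus mirrors the struct from the health module.
type HealthStatus struct {
	Nodes   map[string]string `json:"nodes"`
	Mongo   string            `json:"mongo"`
	Elastic string            `json:"elastic"`
}

// healthReporter is a hypothetical stand-in for health.Service.
type healthReporter interface {
	Health() HealthStatus
}

// fixedHealth is a hypothetical stub that always reports the same status.
type fixedHealth struct{ status HealthStatus }

func (f fixedHealth) Health() HealthStatus { return f.status }

// statusHandler follows the handler in the main module.
func statusHandler(h healthReporter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := json.MarshalIndent(h.Health(), "", "\t")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

// probe calls the handler in-process and returns the status code and decoded body.
func probe(h http.Handler) (int, HealthStatus, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	h.ServeHTTP(rec, req)
	var got HealthStatus
	err := json.NewDecoder(rec.Body).Decode(&got)
	return rec.Code, got, err
}

func main() {
	stub := fixedHealth{status: HealthStatus{
		Nodes:   map[string]string{"node1": "OK"},
		Mongo:   "Mongo ERROR: Service unavailable",
		Elastic: "OK",
	}}
	code, got, err := probe(statusHandler(stub))
	if err != nil {
		panic(err)
	}
	// The HTTP code stays 200 even though MongoDB is reported as failing.
	fmt.Println(code, got.Mongo)
}
```

The stub makes the result deterministic, which the randomised modules in this post cannot offer.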

mongo and elastic modules

I’ll use the rand package to simulate random errors occurring in MongoDB, Elasticsearch and the nodes. A simple simulated mongo module is below; the elastic module is similar.

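package mongo

import (
	"errors"
	"math/rand"
)

// Service is the MongoDB client as seen by the rest of the application.
type Service interface {
	// Health returns nil when MongoDB is reachable, or an error describing the failure.
	Health() error
	// Business methods go here
}

// New creates the simulated MongoDB service.
func New() Service {
	return &service{}
}

type service struct {
	// Some fields
}

// Health simulates a check that fails about half of the time.
func (s *service) Health() error {
	if rand.Intn(2) > 0 {
		return errors.New("Service unavailable")
	}
	return nil
}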
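
The elastic module is not listed in this post. Assuming it mirrors the mongo module line for line, a sketch might look like the following. It is written as package main so it can be run on its own; in the layout used here it would be package elastic, and the loop in main is only there to show the simulated failure rate.

```go
package main // would be "package elastic" in the article's layout

import (
	"errors"
	"fmt"
	"math/rand"
)

// Service is the Elasticsearch client as seen by the rest of the application.
type Service interface {
	Health() error
	// Business methods go here
}

// New creates the simulated Elasticsearch service.
func New() Service {
	return &service{}
}

type service struct {
	// Some fields
}

// Health simulates a check that fails about half of the time.
func (s *service) Health() error {
	if rand.Intn(2) > 0 {
		return errors.New("Service unavailable")
	}
	return nil
}

func main() {
	es := New()
	failures := 0
	for i := 0; i < 1000; i++ {
		if es.Health() != nil {
			failures++
		}
	}
	fmt.Printf("%d of 1000 simulated checks failed\n", failures)
}
```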

health module

And finally, the health module itself:

package health

import (
	"fmt"
	"math/rand"

	"github.com/ypitsishin/code-with-yury-examples/healthcheck/elastic"
	"github.com/ypitsishin/code-with-yury-examples/healthcheck/mongo"
)

// HealthStatus is the JSON payload returned by the /health endpoint.
type HealthStatus struct {
	Nodes   map[string]string `json:"nodes"`
	Mongo   string            `json:"mongo"`
	Elastic string            `json:"elastic"`
}

// Service reports the health of the application and its direct dependencies.
type Service interface {
	Health() HealthStatus
}

type service struct {
	nodes   []string
	mongo   mongo.Service
	elastic elastic.Service
}

// New creates a health service for the given nodes and dependencies.
func New(nodes []string, mongo mongo.Service, elastic elastic.Service) Service {
	return &service{
		nodes:   nodes,
		mongo:   mongo,
		elastic: elastic,
	}
}

// Health collects the status of every node, MongoDB and Elasticsearch.
func (s *service) Health() HealthStatus {
	nodesStatus := make(map[string]string)
	for _, n := range s.nodes {
		// Simulate a node failure in roughly 2 out of 10 checks.
		if rand.Intn(10) > 7 {
			nodesStatus[n] = "Node ERROR: Node not responding"
		} else {
			nodesStatus[n] = "OK"
		}
	}

	mongoStatus := "OK"
	if err := s.mongo.Health(); err != nil {
		mongoStatus = fmt.Sprintf("Mongo ERROR: %s", err)
	}

	elasticStatus := "OK"
	if err := s.elastic.Health(); err != nil {
		elasticStatus = fmt.Sprintf("Elastic ERROR: %s", err)
	}

	return HealthStatus{
		Nodes:   nodesStatus,
		Mongo:   mongoStatus,
		Elastic: elasticStatus,
	}
}

Note that the error messages follow the pattern <service> ERROR: <detail>. This is important because health status messages are intended to be consumed by monitoring systems such as Sensu, and should be easy to parse.
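
To illustrate why a fixed pattern helps, here is a rough sketch of how a consumer could split such a message. The parseStatus function is hypothetical and is not how Sensu or any other monitoring tool actually does it; it only shows that the format can be parsed with a plain string search.

```go
package main

import (
	"fmt"
	"strings"
)

// parseStatus splits a status string of the form "<service> ERROR: <detail>".
// It reports healthy=true for the literal "OK". Any other unrecognised text
// is treated as unhealthy, with the whole message returned as the detail.
func parseStatus(status string) (service, detail string, healthy bool) {
	if status == "OK" {
		return "", "", true
	}
	const sep = " ERROR: "
	if i := strings.Index(status, sep); i >= 0 {
		return status[:i], status[i+len(sep):], false
	}
	return "", status, false
}

func main() {
	for _, s := range []string{"OK", "Mongo ERROR: Service unavailable"} {
		service, detail, healthy := parseStatus(s)
		fmt.Printf("healthy=%t service=%q detail=%q\n", healthy, service, detail)
	}
	// Prints:
	// healthy=true service="" detail=""
	// healthy=false service="Mongo" detail="Service unavailable"
}
```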

Testing

Calling the health check via curl

curl localhost:8080/health

outputs something like

{
	"nodes": {
		"node1": "OK",
		"node2": "OK",
		"node3": "OK"
	},
	"mongo": "Mongo ERROR: Service unavailable",
	"elastic": "OK"
}

Each run of the curl command may produce different output, because the errors are randomised.