Exception Management

Dealing With Exceptions

How a person deals with undesired situations shows their character. Similarly, how an application (which serves the business) responds to failures, treats errors, and gracefully handles crises defines the character of the application.

A microservices architecture relies on distributed transactions that span many independent microservices.

These microservices have many cross-cutting concerns (logging, profiling, etc.). Typically each microservice has its own persistence, and services talk to other services within the ecosystem or to third-party applications.

Given all this, any of these components may go down, fail to respond, or respond with errors. These possible errors and exceptions need to be considered while designing the architecture and building each microservice.

How Exceptions and Errors Are Dealt With

It starts with identifying the points of failure and the possible or known failures and exceptions that can occur.

Make sure that all those errors are recorded.

Each exception and error has a defined treatment. For example, a TimeoutException could trigger multiple retries, an error could cause requests to be routed to an alternate service, or the error could be converted and wrapped into a friendly message that the caller or user can act on.
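The three treatments named above (retry, reroute, wrap) can be sketched in Python roughly as follows. This is only an illustration under assumptions: `call_with_treatment`, `FriendlyError`, and the `primary`/`fallback` callables are hypothetical names, and Python's built-in `TimeoutError` stands in for the TimeoutException mentioned in the text.

```python
import time


class FriendlyError(Exception):
    """Hypothetical error type carrying a message the caller can act on."""


def call_with_treatment(primary, fallback, retries=3, delay_seconds=0.0):
    """Retry `primary` on TimeoutError, then route to `fallback`.

    A failure that survives both is wrapped in a FriendlyError.
    """
    for attempt in range(1, retries + 1):
        try:
            return primary()
        except TimeoutError:
            # Treatment 1: retry a bounded number of times.
            if attempt < retries:
                time.sleep(delay_seconds)
        except Exception:
            # Non-timeout errors are not retried; go straight to the fallback.
            break
    try:
        # Treatment 2: route the request to an alternate service.
        return fallback()
    except Exception as exc:
        # Treatment 3: convert the error into an actionable message,
        # keeping the original exception chained for the logs.
        raise FriendlyError(
            "The service is temporarily unavailable. Please try again later."
        ) from exc
```

A real service would likely add backoff between retries and record every failed attempt, but the order of treatments stays the same.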

  • The recorded errors are analyzed, and the development team learns of new exception types or causes.
  • The team works on fixing the issues, or lets the owner of the service know so that they get addressed.
  • There must be a clear declaration of which component is responsible for handling each exception.
  • There should be a global handler so that, in case an exception is not addressed anywhere in the chain, it acts on it.
  • The team should maintain the error/exception types, the treatment types, and the messages for each.
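The last two points, a maintained catalog of error types with their treatments and messages plus a global handler of last resort, could look something like the Python sketch below. The catalog entries, treatment names, and the `global_handler` decorator are all assumptions made for illustration, not a prescribed design.

```python
import functools
import logging

logger = logging.getLogger("exception_management")

# Hypothetical catalog: exception type -> (treatment, user-facing message).
ERROR_CATALOG = {
    TimeoutError: ("retry", "The request took too long. Please try again."),
    ConnectionError: ("reroute", "A dependent service is unreachable."),
    ValueError: ("reject", "The request contained invalid data."),
}

# Used for exception types the team has not catalogued yet.
DEFAULT_ENTRY = ("escalate", "Something went wrong. The team has been notified.")


def lookup(exc):
    """Return the catalog entry for the most specific matching type."""
    for cls in type(exc).__mro__:
        if cls in ERROR_CATALOG:
            return ERROR_CATALOG[cls]
    return DEFAULT_ENTRY


def global_handler(func):
    """Act on any exception that nothing earlier in the chain handled."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            treatment, message = lookup(exc)
            # Record the error so it can be analyzed later.
            logger.error("unhandled %s: %s (treatment=%s)",
                         type(exc).__name__, exc, treatment)
            return {"status": "error", "treatment": treatment, "message": message}
    return wrapper
```

Exceptions that fall through to `DEFAULT_ENTRY` are the ones worth reviewing first, since they are the new types the team has not yet assigned a treatment to.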

There may be initial teething issues when a component goes live, but over time the component should become robust.

Additional Helping Hands

In the adverse situation where the whole system is down, an appropriate web page must be served to let the user know that the system is unavailable.

It is like lowering a portcullis.
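As one possible illustration, a tiny WSGI application can answer every request with an unavailability page. This is a sketch under assumptions: the page content, the `maintenance_app` name, and the 300-second `Retry-After` value are invented for the example, and in practice this page is often served by a load balancer or gateway rather than by application code.

```python
from http import HTTPStatus

# Hypothetical static page shown while the system is down.
MAINTENANCE_PAGE = (
    b"<html><body><h1>Service unavailable</h1>"
    b"<p>We are working on it. Please try again shortly.</p></body></html>"
)


def maintenance_app(environ, start_response):
    """Minimal WSGI app that answers every request with a 503 page."""
    code = HTTPStatus.SERVICE_UNAVAILABLE
    status = f"{code.value} {code.phrase}"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(MAINTENANCE_PAGE))),
        # Hint to clients and crawlers about when to come back (seconds).
        ("Retry-After", "300"),
    ]
    start_response(status, headers)
    return [MAINTENANCE_PAGE]
```

Returning a 503 status rather than 200 tells clients and monitoring tools that the outage is temporary and the page is not the real content.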

When infrastructure is also treated as code, it can also help manage failures.

An API gateway can address failures and, at times, use a circuit breaker to prevent further failures and provide a better quality of service.
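To make the circuit breaker idea concrete, here is a minimal Python sketch. It is a simplified illustration and not any particular gateway's implementation: the class name, the thresholds, and the injectable `clock` parameter are assumptions. After a number of consecutive failures the breaker "opens" and rejects calls immediately. Once a timeout elapses it lets a trial call through.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so it can be faked in tests
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of loading a struggling service.
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, let this trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open gives the downstream service time to recover and spares callers from waiting on timeouts.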

Exception management leverages the centralized logging, monitoring, and alerting mechanisms that are in place.