Profiling In Microservices

Measuring System Performance

For a business to be successful, it needs working software that serves its customers. That statement alone is inadequate, though: customers want quick service (fast responses). To achieve this, we can run multiple servers and distribute the traffic across them. Simple, isn't it?

Run many servers to serve customers, i.e. over-provision the infrastructure, so that no customer ever has reason to complain. 🙂

Well, that is neither desirable for the business nor, at times, feasible for the implementation team.

Let's consider an example.

Story Of Optimum Usage

Imagine a multi-storey restaurant with many waiters, each allocated specific table numbers on their respective floor.

Now there are two challenges.

First, the restaurant may not be full at all times, or some floors may not be occupied at all. This is a waste of resources, i.e. a waste of money.

Second, if a particular set of tables has a great view, people will want to occupy only those tables. The waiter serving that area stays busy the whole time; since he is the only one serving it, customers are blocked waiting on him, while all the other waiters sit idle.

We can think of software and infrastructure systems serving business functionality to customers in a similar way.

Trouble With Over Provisioning

If the infrastructure is over-provisioned for the maximum load, operational costs are huge.

Second, even with additional hardware capacity, the software may not be designed to handle many requests in parallel and respond quickly.

In a monolith, it is very challenging to scale the system on demand. Because the software is typically not modularized, installing parts of it on different machines is not possible, or where it is possible, it mostly breaks.

Increasing the hardware capacity (vertical scaling) of the existing servers hosting the monolith is the option that remains. Vertical scaling may cause downtime during the scaling process, and an incorrect implementation can lead to bigger trouble.

Microservices are autonomous and independent, hence fine-tuning and optimizing each service individually is possible.

Coupled with the elastic nature of modern infrastructure, this allows scaling the infrastructure up and down on demand or based on defined parameters (number of requests, or time of day).

How To Identify What To Optimize

However, a microservices architecture is distributed in nature, spanning multiple services, multiple servers, and data centers.

This presents a challenge in gathering all the different data elements that can help identify performance bottlenecks.

It also complicates gathering the data elements that help the operations team understand infrastructure utilization and guide server configuration.

The real riddle to solve is how to achieve constant performance improvement and resource optimization in such a large distributed architecture.

Identifying bottlenecks or optimizing resources requires key data elements recorded at a point in time. Such data also needs to be recorded over a period of time, so that decisions can be taken based on a pattern.

From a service application perspective, in order to identify the pieces of code that should be optimized, the development team needs detailed data on how the service behaves in production, with actual load and real user interactions.

Key Data Measured In Profiling

Data that can be gathered from the service includes:

  • Service latency: time taken by the service to respond
  • Method latency: time taken by key code snippets performing critical operations/logic
  • Third-party service latency: how long a third party took to respond
  • Payload size: response time may vary with request and response size
  • Errors and timeouts during execution

If the application queries a database or executes a stored procedure, the time taken by the set of queries, or by a specific query, should also be recorded.
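As a sketch of how such data elements could be captured in code, a timing decorator can record latency and errors for each instrumented operation. The decorator, metric store, and operation names below are illustrative, not from any specific library:

```python
import functools
import time

# Illustrative in-memory metrics store; a real system would ship
# these records to a metrics backend instead of a local list.
metrics = []

def profiled(operation_name):
    """Record latency and error type (if any) for a code path."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            error = None
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                error = type(exc).__name__
                raise
            finally:
                metrics.append({
                    "operation": operation_name,
                    "latency_ms": (time.perf_counter() - start) * 1000,
                    "error": error,
                })
        return wrapper
    return decorator

@profiled("order.lookup")      # hypothetical operation name
def lookup_order(order_id):
    time.sleep(0.01)           # stand-in for real work
    return {"id": order_id}

lookup_order(42)
print(metrics[0]["operation"], metrics[0]["error"])  # -> order.lookup None
```

The same wrapper can be applied to calls into third-party services to capture their latency separately.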

While we are looking at service behavior, we can also take a peek at infrastructure and server data, such as:

  • CPU utilization percentage
  • Network latency and throughput
  • Disk usage vs free
  • Memory usage
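A couple of these host-level numbers can be pulled with Python's standard library alone, as a quick sketch; memory and network figures usually come from a monitoring agent or a library such as psutil, which is not shown here, and `os.getloadavg` is POSIX-only:

```python
import os
import shutil

# Disk usage vs free space for the root filesystem
usage = shutil.disk_usage("/")
print("disk used %:", round(usage.used / usage.total * 100, 1))

# 1-minute load average relative to CPU count, as a rough
# proxy for CPU utilization (POSIX systems only)
load_1m, _, _ = os.getloadavg()
print("load per cpu:", round(load_1m / os.cpu_count(), 2))
```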

Some data that can be collected from the server is:

  • Thread contention
  • Garbage collection
  • Connection pooling
  • Database interactions

Overall end-to-end latency should be measured and then broken down at each layer/component.

Each component can then be further improved upon.

To do this, the service code should be instrumented, and the needed data must be fetched from the server and infrastructure components.
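One simple way to instrument code so that end-to-end latency can be broken down per layer is a timing context manager wrapped around each component call. The layer names below are made up for illustration:

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def layer(name):
    """Accumulate wall-clock time spent in a named layer/component."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + (time.perf_counter() - start)

with layer("handler"):            # end-to-end span
    with layer("database"):       # nested component span
        time.sleep(0.01)          # stand-in for a query
    with layer("serialization"):
        time.sleep(0.005)         # stand-in for building the response

# Report the slowest layers first
for name, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Since the "handler" span encloses the others, subtracting the nested spans from it shows how much time each layer contributes to the end-to-end latency.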

This is called profiling.

What Is Profiling

In a microservices architecture, profiling helps optimize code, servers, and hardware capacity.

Profiling is a graphical or other representation of information relating to particular characteristics of something, recorded in quantified form.

Profiling provides an inventory of performance events and timings for the execution as a whole.

Profiling information gathered from all sources can be plotted on a graph and quickly analyzed.
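Before plotting, the gathered numbers can also be summarized numerically; percentiles in particular expose tail latency that an average would hide. A small sketch with Python's standard library and made-up latency samples:

```python
import statistics

# Hypothetical latency samples (ms) gathered from profiling a service;
# note the 90 ms and 200 ms outliers hiding behind a modest average
samples = [12, 15, 14, 11, 90, 13, 16, 12, 14, 200]

# Cut the distribution into percentiles: p[49] is the median (p50),
# p[94] is the 95th percentile (p95)
p = statistics.quantiles(samples, n=100)
print("p50:", p[49], "ms  p95:", p[94], "ms")
```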

There are many profiling tools available, from code profilers to database query analyzers, e.g. Dynatrace and JProfiler.

There are also many performance optimization helper tools available that provide apt solutions for performance bottlenecks.

For optimization, many techniques are used, such as caching, distributed computing, load balancing, and on-demand horizontal scaling.
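Of these techniques, caching is the easiest to illustrate: in Python, memoizing an expensive call is one decorator from the standard library. The function below is a made-up stand-in for a slow computation or remote call:

```python
import functools

call_count = 0

@functools.lru_cache(maxsize=128)
def expensive_lookup(key):
    """Stand-in for a slow computation or remote call."""
    global call_count
    call_count += 1
    return key.upper()

expensive_lookup("price")   # computed
expensive_lookup("price")   # served from cache, no second call
print(call_count)           # -> 1
```

Profiling data tells you which calls are hot and repetitive enough to be worth caching in the first place.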

Also, functional programming languages like Scala, Clojure, and Haskell are increasingly chosen these days to implement parallelism, which optimizes cost and yields stable, easy-to-understand code.

Need Of Metering

Metering is another important aspect. A meter is a device/mechanism that measures the amount of something that is used.

Metering is achieved by collecting metrics from the applications and infrastructure components.

It contains the data elements that we discussed under profiling. It is important not only to measure the data at a point in time but also to record it continuously for trending and analysis purposes.

This analysis can be used to price product functionality and usage.

E.g. throttling can be applied: 100 requests per day supported on an economy plan, 300 on a premium plan, 500 on a business plan, etc.

Application performance, stability, delivery quality (failures), and hosting and maintenance costs can all be obtained with the help of metering.