Asyncstream | Practical experiance in logging using Logback and Log4J2

A traveller who walks through a forest places some marks on the trees to avoid being lost in the forest. Similarly programmers add logs in order to get some help when tracing an issue. Logs are the excellent source of information if we could place it at the appropriate places in our code. As a programmer and an architect, I have been involved in many production deployments and have to look into the logs to identify the problems.

Of course, nowadays we have systems like Splunk and Elasticsearch to speed up the log analysis but the point is where do you place the logs and what are the keywords or marks you place to ease the analysis process.

When we were doing a banking portal, most of our major concerns were the security (A&A), backend system integration, UI elements etc. But as architects we were very keen on one more area, Logging - a powerful area in application development and maintainance. We were sure that this will help the support team in the future to analyse and close the issues in few hours or even minutes.

As the project was based on Java and Spring Boot, we selected the Logback logging library and the JSON format for logging. We prepared a small guideline for the developers regarding what data needed to be logged and where it should be placed. This helped us to avoid logging too many unnecessary things which could make the system slow in responding to the user actions.

What are useful places to add audit logs

1. The Gate

This is the entry point to your application. This could be a Filter, an intercepter or a REST resource method. Here, we can add the log to identify the invoked URL and the channel (Customer Portal, Mobile App, Employee Portal) through which the call reached the application.

2. The Content Scanner

This is a filter where we log the content of the request. In some cases the request or response might contain very sensitive data and in some other cases the request and response of the called URLs should not be logged. The scanner area decides what contents are to be masked or ignored when logging the request content. We could also use a PatternLayout in the configuration to mask sensitive data.

3. Authentication Filter

In this filter we can identify the authenticated user and add the userid or username in the logging context. In case of login or logout action it is good to add a log to identify the logged in and logged out time. If in case you need to mask the userId, then generate a hash value of the userId using murmur2 or murmur3 hashing algorithm.

4. The Command

Each action in the application is linked to its respective command class/es which invokes the business objects to complete the underlying business operation. We can add the logs at the entry and exit points of this command class to identifiy the time taken by each business operation. (In our project we initially thought that this is enough to identify the round trip time taken for a business operation but later we had to add an intercepter which calculates the actual round trip time of a backend call)

5. Exception Handlers

As everyone knows, exception handlers are the right place to add exception related logs. Here we can not just push the stacktrace but also add additional information like the channel and the business operation which generates the exception.

How did we place the logs

We identified the useful places to add the logs. Then the challenge we faced was, how to group the logs (Fish Tagging) generated as part of a user request and those generated in a user logged-in session or may be in a browser session. Various logging libraries provides options to handle this problem. We could go through the option provided by the famous Logback and Log4J2 libraries.

SLF4J/Logback

    MDC.put("request-id", UUID.randomUUID());
    // The session id could be embedded in a cookie or in the token
    MDC.put("session-id", userSession.getSessionId());
    MDC.put("channel", userSession.getChannel());
    // In the Login/Authentication Filter
    MDC.put("user-id", M3.hash(userSession.getUserId()))

Log4J2

    ThreadContext.put("request-id", UUID.randomUUID());
    // The session id could be embedded in a cookie or in the token
    ThreadContext.put("session-id", userSession.getSessionId());
    ThreadContext.put("channel", userSession.getChannel());
    // In the Login/Authentication Filter
    ThreadContext.put("user-id", M3.hash(userSession.getUserId()))

The challenges

Since the server is processing the request using threads from a thread pool, the values in the MDC/ThreadContext has to be cleared after each request is processed/ or before the processing of each request starts. This could be easily handled using MDC.clear() and ThreadContext.clear() methods provided in the respective libraries.
Async operations starts its own thread so that the MDC in the parent thread will not be available in the child thread. In this case we need to pass the common values in the parent ThreadContext/MDC to the child thread so that it could create the MDC/TreadContext with the common values.

Conclusion

These steps helped us to ingest the logs to ELK stack and extract additional derived attributes out of these logs and presented it in a nice Kibana dashboard. The placing of appropriate logs at right places helped us to solve many disputes and cases without much delay.