Sep 26, 2017

Masking sensitive data in Log4j 2

A growing practice across many organizations is to log as much information as is feasible, to allow for better debugging and auditing. Tools like Splunk and ELK may it even easier to index the logs, treating the them almost like databases.  However, with PCI and HIPAA standards, those same organizations may want to mask much of the data to prevent unauthorized or unprotected access to sensitive data.  In this blog post I’ll detail one potential approach to masking that data, so developers do not need to worry about filtering individual log statements.


You’re going to need to use Log4j 2 (potentially with SLF4J as well).  A sample pom.xml for just these dependencies would include the lines:

If you’re using a different logging framework, then I imagine this guide may not be very helpful.


I’m going to dive right in, as there are a few different files we need to create or modify to get log masking to work. The first file we will create is a pretty basic one. It’s going to hold all of our logging markers, so that we can tell Log4J to only run the masking on the log statements that need it. Masking our logs means we’re taking a performance hit, so we should not do it any more than we need to:

I’ve got two basic Markers in that class, one for JSON, and one for XML. You can define as many as you need — for different content types, data types, etc. For this tutorial we’re only going to be using the JSON marker.

Let’s continue by extending the LogEventPatternConverter:

Ok, let’s stop and analyze the important bits. The “ConverterKeys” value and the “NAME” field we pass to the LogEventPatternConverter define the pattern that we will include in our “log4j2.xml” config. It’s what we need to include to ever see our masking at work. I believe you cannot override the default “%m”, so we are defining our own custom pattern “cm”. In fact, we will call “cm” INSTEAD of the default “m” in our configuration.

Next, the constructor and the “newInstance()” methods are required for our converter to be properly invoked by Log4j. The “format()” method holds the crux of our work. You can see that it takes the formatted message, and returns it if we do not have any Markers for the current logging statement. If we DO have markers (like for example our JSON one), then and only then will we attempt to mask the message.

I’ve implemented a simple JSON regex replacement for the mask method, but there are many different approaches you can take: you can hydrate the JSON and replace the values based on name/path, you can inspect an object to see if it’s annotated with a “DoNotMask” annotation, or you can even define simple regex values to replace (e.g. credit cards, SSNs). The implementation I provide is meant as a proof-of-concept example, and is not prod-ready. Also, if you DO decide to implement multiple strategies for different markers, it makes sense to move that logic into specific classes (I have included everything in one file for simplicity).

As a simple demonstration of this class, let’s also include the tests:

At this point however, we are still not ready to use our class, as Log4j does not know to look for it. For this, we need to update the log4j2.xml file:

The key parts here are to update “Configuration packages” attribute to include the package (or parent) of your LogEventPatternConverter, and to replace, or append “cm” rather than “m” in the pattern. If your logs should be filtering but are instead prefixed by a “c”, then Log4j has not picked up your converter, and you should make sure that the names are correct, and that the package is included in the “Configuration” node!


So hopefully now we have everything hooked up so that our log statements can be masked. In order to take advantage of our converter, we need to log our statements with the appropriate Marker:

If all went well, you should now see your sensitive data being replaced with your mask. As a final note, if you are using Spring Boot, by default Log4J is configured BEFORE Spring Boot components and @Value fields, so if you put your fields-to-mask into a properties file, it may take some extra configuration to make sure Log4J picks them up.

Igor Shults

About the Author

Object Partners profile.

One thought on “Masking sensitive data in Log4j 2

  1. Manjunath says:

    Thank you so much, your blog really helped me.

  2. Namrata Dakua says:

    this is good example.

    Can you please also share the complete log4j2.xml file?

  3. Kannan says:

    Thanks Igor!

  4. Bineeth Baburaj says:

    Your explanation looks very helpful. Could you please provide one example on how to implement this?

  5. Fred says:

    Useless because not full. If you take all the files it doesnt run….
    Where is newIstance called????

Leave a Reply to Bineeth Baburaj Cancel reply

Your email address will not be published.

Related Blog Posts
Building Better Data Visualization Experiences: Part 1 of 2
Through direct experience with data scientists, business analysts, lab technicians, as well as other UX professionals, I have found that we need a better understanding of the people who will be using our data visualization products in order to build them. Creating a product utilizing data with the goal of providing insight is fundamentally different from a typical user-centric web experience, although traditional UX process methods can help.
Kafka Schema Evolution With Java Spring Boot and Protobuf
In this blog I will be demonstrating Kafka schema evolution with Java, Spring Boot and Protobuf.  This app is for tutorial purposes, so there will be instances where a refactor could happen. I tried to […]
Redis Bitmaps: Storing state in small places
Redis is a popular open source in-memory data store that supports all kinds of abstract data structures. In this post and in an accompanying example Java project, I am going to explore two great use […]
Let’s build a WordPress & Kernel updated AMI with Packer
First, let’s start with What is an AMI? An Amazon Machine Image (AMI) is a master image for the creation of virtual servers in an AWS environment. The machine images are like templates that are configured with […]