Automating Testing for Generated HTML Content using Gradle

This post looks at generated HTML content and how to validate it automatically.
The resulting checks are suitable for inclusion in your continuous integration
(CI) pipeline.

In my experience the pendulum has swung a few times between dynamic HTML content
and statically generated content. Without going into a full discussion of the pros
and cons of each approach, I will simply note that there are cases where a
generated site is useful.

There are many solutions for generating HTML content, but we’ll focus on one
tool, since we’re primarily interested in how to validate the HTML output. Grain is
a Groovy-based site generator that uses Gradle for its build system.

htmlSanityCheck Gradle Plugin

The org.aim42.htmlSanityCheck Gradle plugin is an easy way to validate HTML in
your Gradle build. It ships with a number of rules, is configurable, and writes
JUnit-style reports that most CI tools can consume.

Step 1: Create Grain Project

  1. Clone the Octopress Grain theme from https://github.com/double16/grain-theme-octopress
  2. Run ./grainw in the cloned directory
  3. Open http://localhost:4000 in a web browser

Step 2: The htmlSanityCheck Plugin

The htmlSanityCheck plugin is easy to add to this project, and it is already configured in the
repo referenced above. The parts you need to add to your own build.gradle are:

plugins {
  id "org.aim42.htmlSanityCheck" version "0.9.6"
}

htmlSanityCheck {
    // generate is provided by grain, we want the HTML generated before we check it
    dependsOn generate
    // target is the default folder for grain content
    sourceDir = file('target')
    // set the location of the results, this can be used for any project
    checkingResultsDir = file("${buildDir}/reports/htmlchecks")
    // validate external links, it will take longer, adjust as desired
    checkExternalLinks = true
}

// validate HTML as part of the standard Gradle check process
check.dependsOn htmlSanityCheck
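
Because checking external links slows the build, it can help to toggle that check from the command line. The following is a minimal sketch and not part of the original build: the property name checkLinks and the variable verifyLinks are hypothetical names chosen for illustration.

```groovy
// 'checkLinks' is a hypothetical project property chosen for this example.
// External links are verified unless the build is run with -PcheckLinks=false.
def verifyLinks = !(project.hasProperty('checkLinks') &&
        project.property('checkLinks') == 'false')

htmlSanityCheck {
    // checkExternalLinks is the same option set in the configuration above
    checkExternalLinks = verifyLinks
}
```

With this in place, ./gradlew check -PcheckLinks=false would skip the external link checks, while a plain ./gradlew check keeps them on.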

Step 3: Generate and Validate HTML

Validation can be run by itself or as part of the Gradle check task.

./gradlew check

OR

./gradlew htmlSanityCheck

If the build fails, open the build/reports/htmlchecks/index.html file to inspect the failures. More details can
be seen during the build by using the --info option.

./gradlew check --info
...
==================================================
Summary for file index.html

page path  : grain-theme-octopress/target/blog/2014/01/08/pullquote-tag/index.html
page title : Pullquote tag - Octopress theme for Grain
page size  : 16062 bytes


--------------------------------------------------
Results for Duplicate Definition of id Check
4 id checked,
0 duplicate id found.

--------------------------------------------------
...

Sample Output

  • XML output is JUnit compatible so that tools that ingest JUnit reports will work with htmlSanityCheck. Output is in
    the build/test-results/htmlchecks/ folder in the root of the project.

  • The HTML report looks clean and similar to JUnit HTML reports.

(Figure: sample htmlSanityCheck HTML report)
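
As a rough illustration of consuming the JUnit-compatible XML described above, the sketch below adds a Gradle task that totals the results. The task name htmlCheckSummary is hypothetical. The code also assumes that the root element of each report carries the conventional JUnit tests and failures attributes, so verify that against the files the plugin actually writes.

```groovy
// Hypothetical helper task: totals the JUnit-style XML written by htmlSanityCheck.
task htmlCheckSummary {
    doLast {
        int total = 0
        int failed = 0
        // Output folder named in the text above
        def resultsDir = file("${buildDir}/test-results/htmlchecks")
        fileTree(resultsDir).matching { include '**/*.xml' }.each { File report ->
            def builder = javax.xml.parsers.DocumentBuilderFactory.newInstance().newDocumentBuilder()
            def root = builder.parse(report).documentElement
            // Assumed attribute names; a missing attribute reads as '' and counts as 0
            total += (root.getAttribute('tests') ?: '0') as int
            failed += (root.getAttribute('failures') ?: '0') as int
        }
        println "htmlSanityCheck: ${failed} of ${total} checks failed"
    }
}
```

Run it after the reports exist, for example with ./gradlew htmlCheckSummary following a ./gradlew htmlSanityCheck run. The task deliberately does not depend on htmlSanityCheck, so it can still summarize the reports when the check itself fails the build.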

Summary

“Test everything!” applies to your generated HTML. Not only can the HTML be checked
for correctness and valid links, but adherence to standards such as accessibility
can also be validated before publishing the content.

The htmlSanityCheck tool is a great way to test your generated HTML from Gradle or
other build tools. See https://github.com/aim42/htmlSanityCheck for full documentation.

About the Author

Patrick Double

Principal Technologist

I have been coding since 6th grade, circa 1986, professionally (i.e. college graduate) since 1998 when I graduated from the University of Nebraska-Lincoln. Most of my career has been in web applications using JEE. I work the entire stack from user interface to database. I especially like solving application security and high availability problems.
