Google has been updating their APIs to require OAuth 2.0. One of our clients makes considerable use of the Google Webmaster Tools APIs instrumented for 14,629 individual websites (and growing). I recently had the opportunity to help this client upgrade their internal application that retrieves the data from the Google API and stores it in a relational database for their internal applications to process. Our client had been using the V2 API, which Google is now phasing out in favor of the V3 API with no backward compatibility. Google supplies copious documentation on their developer portal, but it took time to digest that information and use it to inform a solution that would meet our client’s needs. Not only did Google change the method of authentication, but they also tinkered with their APIs, adding some functionality and removing other functionality that our client had been relying on.
What I will describe in this blog post is how to build up an unattended application that will successfully utilize Google’s OAuth2 mechanism to gain access to user data. Our client needed both the GMail and Search Console (formerly Webmaster Tools) APIs, but for the sake of brevity, my example will relate only to the GMail API, which serves as an interesting and straightforward example.
The guiding principle of OAuth is to create an authentication and authorization system that is not reliant on a username/password combination being broadcast and to instead rely upon a token exchange mechanism built upon digital signature technology. OAuth is billed as a “simple way to publish and interact with protected data” and layers security around RESTful web services. If you are more familiar with SOAP-style web services, you know that authentication was basically designed in from the get-go whereas REST leaves it up to others to decide how security should be implemented.
OAuth dates back to 2006. It has since been superseded by OAuth2 (OAuth 2.0, published as RFC 6749 in 2012), which is an authorization framework covering several flows, whereas the original OAuth was a single protocol specification.
The following flow diagram is an illustration of how OAuth2 works in the abstract:
Google has its own implementation of OAuth2 and the company is now in the process of updating all their services to use it.
The main point of interaction when setting up access to a particular API is the Google Developers Console.
Create a project:
Within the project, enable the desired APIs:
Generate an OAuth 2.0 client id and client secret:
Download the client id file as JSON:
Now that an API is enabled and the client id JSON file is available locally, all the ingredients exist to build an application that can access that API. In the real application I built, I stored the client id files in the database. However the data is stored, it is critical to safeguard it from unauthorized access.
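To make the client id file a little more concrete, here is a minimal Python sketch (not the code from the real application) that parses such a file and checks that the fields an OAuth2 client needs are present. The sample values are placeholders, and the `load_client_config` helper and `REQUIRED_FIELDS` list are names I made up for illustration; the nesting under an `installed` or `web` key reflects how the downloaded files are typically laid out.

```python
import json

# Fields an OAuth2 client needs from the file (an assumption for this sketch).
REQUIRED_FIELDS = ("client_id", "client_secret", "auth_uri", "token_uri")


def load_client_config(raw_json: str) -> dict:
    """Parse the text of a client id file and return its inner settings."""
    data = json.loads(raw_json)
    # The settings sit under one top-level key that names the application type.
    for app_type in ("installed", "web"):
        if app_type in data:
            config = data[app_type]
            break
    else:
        raise ValueError("no 'installed' or 'web' section found")
    missing = [field for field in REQUIRED_FIELDS if field not in config]
    if missing:
        raise ValueError("client id file is missing: " + ", ".join(missing))
    return config


# Placeholder content; a real file holds the values issued by the console.
SAMPLE = """
{"installed": {
  "client_id": "1234567890-example.apps.googleusercontent.com",
  "client_secret": "not-a-real-secret",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "redirect_uris": ["http://localhost"]
}}
"""

config = load_client_config(SAMPLE)
print(config["client_id"])
```

Because the function takes the file's text rather than a path, the same check works whether the JSON comes from disk or, as in my case, from a database column.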
When generating the client id there is also the ability to generate a service account key. That led me to wonder whether a service account might be a better approach for running unattended access to a Google API. The answer in general is: “It depends.” A service account is not capable of accessing any user data. Since my application needed access to a GMail inbox, I could not use a service account, although I did try it. Gaining access to user data specifically requires a client id and a grant of access from the owner of the account. A service account is strictly for access to non-user-specific data such as data from the Search API, Google Maps, or YouTube.
My dilemma was that if the account owner has to give consent each time my data retrieval program runs, I’m stuck. Active user involvement is fine for one time or when the account owner is involved in using the program. In our client’s case, they have 53 accounts that they want to retrieve certain email messages from weekly and store them in a database. There are no “people” associated with these accounts who could even grant access! How do I automate that?
To solve the problem, I wrote a separate command-line program and broke out the “User” part of the OAuth2 flow diagram into a completely separate operation. I used my one-off program to perform the initial authorization.
In the code, I configured the API scopes I needed (which APIs, basically) and the location of the client id file, and then I ran it, systematically authorizing each individual account by hand. The one nasty issue here was that I had to be extra careful to ensure I was logged in to the correct account, since Chrome is designed to work closely with Google accounts and is a little ‘sticky’ with the session cookies. It does no good to want e-mail data from account “email@example.com” but grant access to account “firstname.lastname@example.org” by mistake. With so many accounts in play, it was tedious to carefully log Chrome in and out of each account and grant access. Luckily, it was a one-time slog.
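For readers who want to see what the consent step looks like on the wire, the following Python sketch builds the kind of authorization URL that the browser is sent to. It is an illustration under assumptions, not my one-off program: the `build_consent_url` helper, the client id, the redirect URI, and the mailbox address are all hypothetical. The endpoint, the Gmail read-only scope, and the `login_hint` parameter are part of Google's OAuth 2.0 interface; I did not use `login_hint` in the project, but it may help with the wrong-account problem because it tells Google which account the request is meant for.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
GMAIL_READONLY = "https://www.googleapis.com/auth/gmail.readonly"


def build_consent_url(client_id, redirect_uri, scopes, account_email):
    """Return the URL an account owner visits to grant access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",        # ask for an authorization code
        "scope": " ".join(scopes),      # scopes are space-separated
        "login_hint": account_email,    # nudge Google toward this account
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)


# All values below are placeholders for illustration.
url = build_consent_url(
    client_id="1234567890-example.apps.googleusercontent.com",
    redirect_uri="http://localhost:8080",
    scopes=[GMAIL_READONLY],
    account_email="mailbox-01@example.com",
)
print(url)
```

Generating one URL per account from a list would at least remove the guesswork about which mailbox each consent screen belongs to, even if the clicking still has to be done by hand.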
Upon successful granting of access, a binary “StoredCredential” file is automatically saved to a configured location. Inside the stored credential file is the access token that allows access to the selected APIs. The access token is good for only 60 minutes, so a program that runs on a schedule needs a new token for each run. Through painstaking examination of documentation and web forums, I arrived at the sneaky way to accomplish the automation I needed: I instructed the command-line program to retrieve what is called a “refresh token” at the same time it retrieved its first access token. I did this by setting the access type property to “offline” (line 5 in the following Gist):
Once in possession of a refresh token in the stored credential file, the program is able to automatically retrieve a fresh access token whenever it needs one.
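Under the hood, the refresh is a plain form POST to Google's token endpoint, and the refresh token is only handed out when the original authorization request carried `access_type=offline`. The Python sketch below shows both pieces using only the standard library. It is a rough illustration rather than what the client library does internally: the helper names, the 60-second safety margin, and all credential values are my own assumptions, and the request is built but deliberately not sent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def with_offline_access(auth_url: str) -> str:
    """Return auth_url with access_type=offline set in its query string."""
    parts = urlsplit(auth_url)
    query = dict(parse_qsl(parts.query))
    query["access_type"] = "offline"
    return urlunsplit(parts._replace(query=urlencode(query)))


def needs_refresh(expires_at: float, now: float, margin: float = 60.0) -> bool:
    """True when the access token is expired or within `margin` seconds of it."""
    return now >= expires_at - margin


def build_refresh_request(client_id, client_secret, refresh_token) -> Request:
    """Build (but do not send) the POST that trades a refresh token for a new access token."""
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; a real run would read these from the stored credential.
issued_at = 1_700_000_000.0
expires_at = issued_at + 3600  # access tokens last 60 minutes
print(needs_refresh(expires_at, now=issued_at + 10))
print(needs_refresh(expires_at, now=issued_at + 3590))

request = build_refresh_request("client-id", "client-secret", "refresh-token")
print(request.get_method(), request.full_url)
# Sending it with urllib.request.urlopen(request) would return a JSON body
# containing a fresh access token and its lifetime in seconds.
```

The client library performs this exchange automatically, so in practice I never had to write it myself; seeing the request spelled out simply makes it clear why no human needs to be present.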
Most importantly, the stored credential file is completely portable, meaning one program can be used to authorize access and then that resulting credential file can be deployed with another program that will run in batch mode on a server someplace. Once I worked this out and demonstrated feasibility, the project was in the bag.
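The portability idea can be sketched in a few lines of Python. The real “StoredCredential” file is a binary file managed by Google's client library, so the JSON layout, field names, and helper functions below are stand-ins I invented purely to show the pattern: one program writes the credential with restrictive permissions, and another program loads it later.

```python
import json
import os
import tempfile


def save_credential(path, refresh_token, access_token, expires_at):
    """Write the tokens to `path`, readable and writable by the owner only."""
    payload = {
        "refresh_token": refresh_token,
        "access_token": access_token,
        "expires_at": expires_at,
    }
    # Mode 0o600 applies when the file is created (POSIX systems).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)


def load_credential(path):
    """Read the tokens back, e.g. from the batch program on the server."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Demonstration with placeholder tokens in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    credential_path = os.path.join(workdir, "credential.json")
    save_credential(
        credential_path,
        refresh_token="refresh-token-placeholder",
        access_token="access-token-placeholder",
        expires_at=1_700_003_600.0,
    )
    loaded = load_credential(credential_path)

print(sorted(loaded))
```

Whatever the storage format, the file holds a long-lived refresh token, so it deserves the same protection as the client id file.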
I will not delve too far into the GMail API because it is already well documented by Google. My code example retrieves the message data for each e-mail message to interrogate the from address in the header, which relates to our client’s requirements around messages coming from the Webmasters service.
After the “from” filtering takes place, the program prints out who the message came from and the subject of the email.
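The filtering logic itself is simple enough to sketch in Python without calling the API. A message returned by the GMail API carries its headers as a list of name/value pairs under `payload.headers`; the two messages below are made-up samples in that shape, and the sender address and helper names are hypothetical rather than taken from my program.

```python
from email.utils import parseaddr


def header_value(message: dict, name: str) -> str:
    """Return the first header called `name` (case-insensitive), or ''."""
    for header in message.get("payload", {}).get("headers", []):
        if header.get("name", "").lower() == name.lower():
            return header.get("value", "")
    return ""


def is_from(message: dict, wanted_address: str) -> bool:
    """True when the From header's e-mail address matches wanted_address."""
    _, address = parseaddr(header_value(message, "From"))
    return address.lower() == wanted_address.lower()


# Made-up sample data shaped like GMail API message resources.
messages = [
    {"id": "m1", "payload": {"headers": [
        {"name": "From", "value": "Alerts <alerts@example.com>"},
        {"name": "Subject", "value": "Sample notification"},
    ]}},
    {"id": "m2", "payload": {"headers": [
        {"name": "From", "value": "someone@example.org"},
        {"name": "Subject", "value": "Unrelated message"},
    ]}},
]

matches = [m for m in messages if is_from(m, "alerts@example.com")]
for message in matches:
    print(header_value(message, "From"), "|", header_value(message, "Subject"))
```

Parsing the address with `parseaddr` rather than comparing raw strings keeps the match working when the header includes a display name.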
It will suffice to mention that it was extremely easy to use the GMail API to retrieve any data I needed from each account and it arrives in an easy-to-consume JSON format. Google obviously uses their own API for the GMail web client. It is entirely possible to write a custom e-mail client with full GMail feature compatibility.
Given the codebase I had to start with, the concept of using OAuth2 with Google APIs was initially daunting compared to the older API key method that Google had previously supplied. While the new protocol itself is understandable for a single Google account, extrapolating what to do for our client, with 53 Google accounts and so many websites in the mix, was where the real work happened. I ended up following the typical task decomposition exercise:
My key take-away from the project is that Google appears to do things for Google’s own sake. For example, Version 3 of the Webmaster API does not include the same capabilities nor the same mechanisms for getting at the data as Version 2. I was creative and found work-arounds, but there was no documented migration path. Google may have some strategic reason for changing and limiting functionality, but it broke our client’s business process, which leads to distrust. I do not sense that Google’s design planned for anyone to be using multiple accounts. The number of websites that can be managed by a single account is 1000, which probably covers 99% of their service consumers, so indeed our client is a rather far outlier. In the end, I solved the problem, but it required perseverance to reach a solution.