Mar 16, 2021

Google Professional Machine Learning Engineer Exam 2021

Exam Description

A Professional Machine Learning Engineer designs, builds, and productionizes ML models to solve business challenges using Google Cloud technologies and knowledge of proven ML models and techniques. The ML Engineer is proficient in all aspects of model architecture, data pipeline interaction, and metrics interpretation and needs familiarity with application development, infrastructure management, data engineering, and security.

The exam is 60 questions and you have two hours to complete it. I found that I had plenty of time to read and answer the questions, and I typically struggle with timed/standardized testing. Be sure to give yourself time to review your answers; I personally marked questions for review whenever I wasn't 100% sure, which let me go back and rethink the problem. Remember that in Google exams pretty much all of the answers given are possible and acceptable solutions to the problem. You are typically looking for the fastest solution, unless prompted otherwise. Depending on your understanding of ML, an introduction to GCP's ML offerings should suffice, but it is good to familiarize yourself with the tools that GCP uses. More hands-on experience is better and will most likely yield better results. Also, I personally like taking exams at testing centers. I would not recommend the online proctored method unless you have a space where literally no one can distract you, since the online proctor can stop you if they think you may be trying to cheat or record the exam.

My goal with this blog post is to provide a collection of the tools, courses, experience, and information that helped me pass the exam. I would recommend using this guide and filling in information as you go through the Qwiklabs. There won't be any exam questions provided, but there will be hints on what to study that will help with exam questions. Here are a couple of hints to always keep in mind:

  1. Unless the question specifically asks for the most performant solution, choose the solution that solves the problem the fastest and simplest. Be sure to meet all the requirements of the question, though!
  2. The following is the order in which to consider solutions, starting with the quickest to implement:
    ML APIs -> AutoML -> BigQuery ML -> AI Platform built-in algorithms -> AI Platform Training
    It is good to know each of their limitations and capabilities.
  3. Typically, part of the right selection will be repeated within the wrong answers, so watch for it.
  4. Make sure to do the 10 sample questions that Google will make available to you. I didn’t run into them on the test but they do give you useful information that may relate to other questions.
  5. I would definitely recommend purchasing Machine Learning Design Patterns. Not only is it helpful for the exam but it will help you in implementing ML systems in the future!
  6. In the exam, watch for indicators like real-time or batch, performant or quickly implement, and no-code. These will help guide your thought process on what solutions to pick for the situation-based problems.
  7. I would recommend having a decent understanding of CNNs, RNNs, and Recommendation System Types and when to use them. Understand their pitfalls and advantages. The Qwiklab should provide all the information you need.

The Professional Machine Learning Engineer exam assesses your ability to:

  • Architect ML solutions
  • Prepare and process data
  • Develop ML models
  • Automate & orchestrate ML pipelines
  • Monitor, optimize, and maintain ML solutions

Exam Guide

Section 1: ML Problem Framing

  • Defining business problems
  • Identifying non-ML solutions
    • Definitely had a question related to this so be sure you know when using ML is appropriate! At the end there will be a list of flowcharts that will explain when to use ML.
  • Defining output use
  • Managing incorrect results
    • How to fix your model based on the results and model performance
  • Identifying data sources
  • Defining problem type (classification, regression, clustering, etc.)
  • Defining outcome of model predictions
  • Defining the input (features) and predicted output format
  • Success metrics
  • Key results
  • Determination of when a model is deemed unsuccessful
  • Assessing and communicating business impact
  • Assessing ML solution readiness
  • Assessing data readiness
    • Training/Serving Skew
  • Aligning with Google AI principles and practices (e.g. different biases)
    • Know the difference between Sampled Shapley, Integrated Gradients, and XRAI.
    • You should also know what they are good at and what they are bad at.

Section 2: ML Solution Architecture

  • Optimizing data use and storage
  • Data connections
  • Automation of data preparation and model training/deployment
  • SDLC best practices
  • A variety of component types – data collection; data management
  • Exploration/analysis
  • Feature engineering
    • numerical
    • categorical encoded/embedded/hashed
    • bucketized encoded/embedded/hashed
    • Crossed
      • Know when to do this properly
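To make these transforms concrete, here is a minimal pure-Python sketch of bucketizing, hashing, and feature crossing. This is purely illustrative (the values and bucket counts are hypothetical); in practice you would use TensorFlow feature columns or TensorFlow Transform rather than hand-rolling it.

```python
import hashlib

def bucketize(value, boundaries):
    """Return the index of the bucket that `value` falls into."""
    for i, boundary in enumerate(boundaries):
        if value < boundary:
            return i
    return len(boundaries)

def hash_feature(value, num_buckets):
    """Deterministically hash a categorical value into a fixed bucket range,
    useful when the vocabulary is large or unknown ahead of time."""
    digest = hashlib.md5(value.encode()).hexdigest()
    return int(digest, 16) % num_buckets

def cross_features(a, b, num_buckets):
    """A feature cross: combine two categorical values, then hash the pair
    so the model can learn interactions between them."""
    return hash_feature(f"{a}_x_{b}", num_buckets)

# Hypothetical example: bucketize an age, then cross it with a city.
age_bucket = bucketize(34, boundaries=[18, 25, 35, 50, 65])
crossed = cross_features(f"age_bucket_{age_bucket}", "NYC", num_buckets=1000)
```

Crossing is most useful when two features only matter in combination (e.g. latitude x longitude buckets for location); crossing already-dense numerical features without bucketizing first is usually a mistake.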
  • Logging/management
  • Automation
  • Monitoring
  • Serving
    • AI Platform Prediction vs KubeFlow Pipelines
  • Selection of quotas and compute/accelerators with components
  • Building secure ML systems
  • Privacy implications of data usage
  • Identifying potential regulatory issues
  • Google ML APIs
    • Know each API's capabilities, when to use them, and which scenarios they are best optimized for.
    • Know how to use and troubleshoot the Dialogflow and Recommendations AI.
    • Make sure to use Data Labeling Services if you need to label your data.

Section 3: Data Preparation and Processing

  • Ingestion of various file types (e.g. CSV, JSON, image, Parquet, databases, Hadoop/Spark)
  • Database migration
  • Streaming data (e.g. from IoT devices)
  • Visualization
    • Know when to use BigQuery to create aggregations for plotting in AI Platform.
    • Also know when to use Google Data Studio
  • Statistical fundamentals at scale
  • Evaluation of data quality and feasibility
  • Batching and streaming data pipelines at scale
    • Cloud Dataflow
    • Cloud Dataflow
    • Cloud Dataflow
    • I can’t stress the importance of knowing the ins and outs of Cloud Dataflow.
    • Cloud Dataproc if you have existing on-prem Hadoop/Spark jobs and data is loaded in Cloud Storage
  • Data privacy and compliance
  • Monitoring/changing deployed pipelines
  • Data validation
  • Handling missing data
    • How to properly impute missing data
      • Numerical and Categorical Data
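As a concrete illustration of imputation, here is a minimal standard-library sketch: median for numerical features, mode (most frequent category) for categorical ones. The data is hypothetical; at scale you would do this in BigQuery or TensorFlow Transform.

```python
import statistics

def impute_numerical(values):
    """Replace missing numerical values (None) with the column median,
    which is more robust to outliers than the mean."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def impute_categorical(values):
    """Replace missing categorical values with the most frequent category.
    An alternative is to keep an explicit MISSING token as its own category."""
    observed = [v for v in values if v is not None]
    mode = statistics.mode(observed)
    return [mode if v is None else v for v in values]

ages = impute_numerical([25, None, 40, 31, None])          # median of observed = 31
colors = impute_categorical(["red", None, "red", "blue"])  # mode = "red"
```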
  • Handling outliers
    • Understand Normalization and when you should or shouldn’t remove outliers
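A small sketch of why these two topics go together: z-score normalization uses the mean and standard deviation, so outliers distort it, and clipping is one way to cap them without dropping whole rows. The values are hypothetical.

```python
import statistics

def z_normalize(values):
    """Standardize to zero mean and unit variance (z-score normalization).
    Sensitive to outliers, since they pull the mean and inflate the std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def clip_outliers(values, low, high):
    """Clip extreme values to a range instead of removing the rows,
    preserving the rest of each example's features."""
    return [min(max(v, low), high) for v in values]

scores = z_normalize([2, 4, 4, 4, 5, 5, 7, 9])   # mean 5, std 2
capped = clip_outliers([1, 50, 200, -3], 0, 100)
```

Whether to remove, clip, or keep outliers depends on whether they are measurement errors or genuine rare events the model should learn from.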
  • Managing large samples (TFRecords)
  • Transformations (TensorFlow Transform)
  • Data leakage and augmentation
    • Understand when Data leakage becomes a problem and what typically happens to models when it occurs.
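One classic leakage mistake is computing preprocessing statistics on the full dataset before splitting, which lets test-set information shape the transform and inflates offline metrics. A minimal sketch of the leakage-free order (hypothetical values):

```python
import statistics

def fit_scaler(train_values):
    """Compute normalization statistics from the TRAINING split only."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)

def transform(values, mean, std):
    return [(v - mean) / std for v in values]

train = [2, 4, 4, 4, 5, 5, 7, 9]
test = [10, 12]

# Leakage-free: the test rows never influence the statistics.
mean, std = fit_scaler(train)
test_scaled = transform(test, mean, std)

# Leaky (what NOT to do): fitting on train + test lets held-out data
# influence training, so offline metrics look better than production will.
leaky_mean, leaky_std = fit_scaler(train + test)
```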
  • Encoding structured data types
  • Feature selection
    • PCA
    • Also attributes of what makes a good feature
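For intuition on PCA, here is a minimal NumPy sketch that projects centered data onto its top principal components via SVD. The dataset is hypothetical; at scale you would use a managed equivalent rather than in-memory SVD.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components.
    Center the features, take the SVD, and keep the leading directions."""
    Xc = X - X.mean(axis=0)                   # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T           # scores in the reduced space

# Hypothetical data: the second feature is roughly twice the first,
# so a single component captures nearly all the variance.
X = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]])
Z = pca(X, n_components=1)
```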
  • Class imbalance
  • Feature crosses

Section 4: ML Model Development

  • Choice of framework and model
  • Modeling techniques given interpretability requirements
  • Transfer learning <- Know when to use this!!!
  • Model generalization
    • Understand what this is and what it does
  • Overfitting
    • Know when you should use overfitting for model analysis
  • Productionizing
  • Training a model as a job in different environments
  • Hyperparameter Tuning
    • Google Vizier
  • Tracking metrics during training
    • TensorBoard
  • Retraining/redeployment evaluation
  • Unit tests for model training and serving
    • Know when/where to do this and what the unit test does
  • Model performance against baselines, simpler models, and across the time dimension
    • Understand Regularization Techniques
    • Understand Bias/Variance Trade-Off
    • Understand Learning Rate & Batch Size Trade Offs
      • Model Impact
      • Compute/Memory Impact
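To see the learning-rate side of this trade-off concretely, here is a toy gradient descent on f(x) = x² (gradient 2x). It is purely illustrative: a small learning rate converges slowly but surely, while an overly large one overshoots the minimum and diverges.

```python
def minimize(lr, steps=50, x0=10.0):
    """Run gradient descent on f(x) = x**2 with learning rate `lr`.
    Each step: x <- x - lr * f'(x), where f'(x) = 2x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

small = minimize(lr=0.1)   # shrinks toward the minimum at 0
large = minimize(lr=1.1)   # each step overshoots, so |x| grows without bound
```

Batch size interacts with this: larger batches give smoother gradient estimates and better accelerator utilization but cost more memory, and they typically allow (and often require) a larger learning rate.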
  • Model explainability on Cloud AI Platform
    • Know how to use and interpret these tools
      • TensorFlow Data Validation and Facets
      • TensorBoard What-If Tool
  • Distributed training
    • Know the different types of distributed training and their components
      • Data Parallelism
      • Model Parallelism
    • Remember AI Platform built-in algorithms can’t use distributed training
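The core idea of data parallelism can be sketched in a few lines of NumPy: every worker holds a full copy of the model, computes gradients on its own shard of the batch, and the gradients are averaged (the all-reduce step) before a single shared update. This is an illustrative toy, not any framework's actual API.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of mean-squared error for a linear model y ~ X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

def data_parallel_step(w, X, y, num_workers, lr=0.1):
    """One data-parallel step: shard the batch across workers, compute
    per-shard gradients, average them, then apply one weight update."""
    X_shards = np.array_split(X, num_workers)
    y_shards = np.array_split(y, num_workers)
    grads = [gradient(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]
    return w - lr * np.mean(grads, axis=0)

# With equally sized shards this matches a single-worker step exactly.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])
w = np.zeros(2)
w_next = data_parallel_step(w, X, y, num_workers=2)
```

Model parallelism, by contrast, splits the model itself across devices, which is needed when the model is too large to fit on a single accelerator.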
  • Hardware accelerators
    • TPU vs GPU
      • Remember TPUs aren't always the best option
      • TPUs are typically meant for fast batch prediction
  • Scalable model analysis (e.g. Cloud Storage output files, Dataflow, BigQuery, Google Data Studio)

Section 5: ML Pipeline Automation & Orchestration

  • Identification of components, parameters, triggers, and compute needs
  • Orchestration framework <- Cloud Composer / Airflow
  • Hybrid or multi-cloud strategies
  • BigQuery ML Capabilities and Limitations
    • BigQuery Struct Type
  • Decoupling components with Cloud Build
  • Constructing and testing of parameterized pipeline definition in SDK
  • Tuning compute performance
    • Know for Serving, Training, and Input Pipelines
  • Performing data validation
    • TensorFlow Data Validation
      • Data Drift
  • Storing data and generated artifacts
  • Model binary options
    • Know what are the most optimal solutions and understand the trade-offs on serving the binaries from different options
  • Google Cloud serving options
    • KubeFlow Pipelines
    • AI Platform
  • Testing for target performance
    • AUC ROC
      • Understand why AUC is typically better
    • Precision vs Recall
    • F1 Score
    • Confusion Matrix
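These metrics are worth being able to compute by hand. A minimal sketch for the binary case (hypothetical labels; in practice you'd use a library like scikit-learn):

```python
def confusion_counts(y_true, y_pred):
    """Binary confusion-matrix counts: (TP, FP, FN, TN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
```

AUC ROC summarizes performance across all classification thresholds rather than at one fixed threshold, which is a common reason it's preferred for comparing models, though precision/recall matter more under heavy class imbalance.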
  • Setup of trigger and pipeline schedule
    • Cloud Functions vs App Engine vs Cloud Dataflow
  • Organization and tracking experiments and pipeline runs
    • KubeFlow Pipelines
  • Hooking into model and dataset versioning
  • Model/dataset lineage
  • Hooking models into existing CI/CD deployment system
    • For almost any CI/CD requirement, Cloud Build isn't a bad idea to consider
  • A/B and canary testing

Section 6: ML Solution Monitoring, Optimization, and Maintenance

  • Performance and business quality of ML model predictions
  • Logging strategies
  • Establishing continuous evaluation metrics
  • Permission issues (IAM)
  • Common training and serving errors (TensorFlow)
  • Troubleshooting Deep Learning VMs
  • ML system failure and biases
  • Optimization and simplification of input pipeline for training
  • Simplification techniques
  • Identification of appropriate retraining policy

Helpful Study Materials

Other Great Labs To Do:

Hands On Training
Get extra practice with Google Cloud through self-paced exercises covering a single topic or theme offered via Qwiklabs.

Here is a list of links to Google Docs that were extremely helpful for me!

Thanks for reading and please leave any feedback! I hope to update the blog in the near future with my notecards that I used to study for the exam.

Also if you are interested in having Object Partners Inc., an Improving Company, come help on your ML/AI Project feel free to reach out on Twitter, LinkedIn, or Email! We have expertise in GCP, AWS, and Azure! Also we have the rare AWS status of Black Belt Certified in ML and AI.

Bonus Charts

Thanks to Sara Robinson for Tweeting this!

About the Author


Scott Poulin

Sr. Consultant