Rewriting files in Google Cloud Storage

Note: even though this code is in Python, this should be the same idea in JavaScript, Go, etc.

I wrote the following to copy a file from one Google Cloud Storage bucket to another:

src_blob = src_bucket.blob(file_name)
dest_blob = src_bucket.copy_blob(src_blob, dest_bucket, new_name=new_name)

But for the bigger files (around 120MB or so) I got the following:

Copy spanning locations and/or storage classes could not complete within 30 seconds. Please use the Rewrite method (https://cloud.google.com/storage/docs/json_api/v1/objects/rewrite) instead.

I noted that copy_blob has a timeout parameter, so why not try that?

src_blob = src_bucket.blob(file_name)
dest_blob = src_bucket.copy_blob(src_blob, dest_bucket, new_name=new_name, timeout=180)

And… same error:

Copy spanning locations and/or storage classes could not complete within 30 seconds. Please use the Rewrite method (https://cloud.google.com/storage/docs/json_api/v1/objects/rewrite) instead.

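Note that it still says 30 seconds, so my timeout parameter made no difference. As far as I can tell, timeout only controls how long the client waits for a response, while the 30-second limit is enforced by the service itself. The rewrite docs at that link cover only the raw JSON API, not the Python client library I was using. After some digging and StackOverflow reading, I came up with this snippet: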

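  src_blob = src_bucket.blob(file_name)
  # Name the destination object new_name, matching the copy_blob call above
  dest_blob = dest_bucket.blob(new_name)
  rewrite_token = None  # None starts a new rewrite

  while True:
      # Each call copies part of the object and returns a token until it is done
      rewrite_token, bytes_rewritten, bytes_to_rewrite = dest_blob.rewrite(
          src_blob, token=rewrite_token
      )
      print(
          f"\t{new_name}: Progress so far: {bytes_rewritten}/{bytes_to_rewrite} bytes."
      )
      if not rewrite_token:
          break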

That prints the progress after each rewrite call. With my 120MB files, a single call was enough to finish the copy. Overall, I found this faster than copy_blob, even for the small files.
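If you need this in more than one place, the loop can be wrapped in a small helper. The following is only a sketch: rewrite_blob and on_progress are names I made up for illustration, and it assumes dest_blob.rewrite(source, token=...) returns a (token, bytes_rewritten, total_bytes) tuple, as it does in the snippet above.

```python
from typing import Callable, Optional


def rewrite_blob(
    src_blob,
    dest_blob,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> int:
    """Copy src_blob to dest_blob with repeated rewrite calls.

    Hypothetical helper: returns the number of bytes rewritten once the
    service stops handing back a continuation token.
    """
    token = None  # None starts a fresh rewrite
    while True:
        # Each call copies part of the object; a token means "call again".
        token, bytes_rewritten, total_bytes = dest_blob.rewrite(
            src_blob, token=token
        )
        if on_progress is not None:
            on_progress(bytes_rewritten, total_bytes)
        if not token:
            return bytes_rewritten
```

With the buckets from earlier, a call would look something like rewrite_blob(src_bucket.blob(file_name), dest_bucket.blob(new_name), on_progress=lambda done, total: print(f"{done}/{total} bytes")).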

About the Author

Mike Hostetler

Principal Technologist

Mike has almost 20 years of experience in technology. He started in networking and Unix administration, and grew into technical support and QA testing. But he has always done some development on the side and decided a few years ago to pursue it full-time. His history of working with users gives Mike a unique perspective on writing software.
