Simple GCP Instance Scheduler (w/ Terraform, Cloud Function)

So I’m starting to pick up on this pattern: if I don’t do enough Terraform/CloudSec automation/Python/Go at work, I’ll end up doing it over the weekend. And when I’m getting my “fix”, I guess, I stop posting articles for long stretches…is this that thing they call balance? I’m going to try to keep this one pretty short and sweet.

Today I bring you a simple GCP Instance Scheduler! One “terraform apply” to your project and it will turn off your instances based on a cron expression set in GCP Cloud Scheduler and a label of your choice. Simply edit variables.tf with your label, label the instances in scope, set the cron job, apply, and you’re set. Perfect for dev instances, maybe?

Here is the repo with all the Terraform and Python:

And here is the breakdown of how it all works (4 main components):

Variables.tf

  • Project: Enter your GCP project here.
  • Cron Pattern: Enter your cron expression here. The default runs the job every day at 6pm.
  • Label Key: Enter the label key. The default is set to “instance-scheduler”.
  • Label Value: Enter the label value. The default is set to “enabled”.
  • Scheduler Function Bucket: Enter a bucket name (must be globally unique). This will store the zip file the Cloud Function needs. What, is that cloud function going to run on your laptop every time it’s called? I think not!

Main.tf

  • Cloud Scheduler: This is the job that fires on the cron expression set in variables.tf.
  • PubSub Topic: Cloud Scheduler triggers on that expression and sends a message (the content of the message doesn’t matter here) to the PubSub topic.
  • Cloud Function: A cloud function is “subscribed” to the topic. When the topic “publishes” the message sent to it, the function is invoked. The function gets the list of current zones and iterates through each zone looking for instances that match the filter. The filter looks for running instances with the matching label key and label value. If an instance matches, it is shut down.
  • IAM Role: In order for the cloud function to be able to shut down an instance, it needs the following permissions:
  1. To be able to see all the zones (compute.zones.list)
  2. To be able to see all the instances (compute.instances.list)
  3. To be able to stop an instance (compute.instances.stop)
  • And since best practice dictates “least privilege”, it’s best to use a custom role here. Roles like Compute Admin or Compute Instance Admin are far too powerful for this simple little shutdown tool. The Terraform in the repo also creates the necessary custom role and service account, and adds the service account as a member of the project with the three specific permissions listed above.
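If you ever want to double-check that least privilege actually landed, Google’s Resource Manager API has a testIamPermissions call that echoes back which of a given list of permissions the caller really holds. Here is a small hedged sketch (this helper is mine, not part of the repo; only the three permission strings come from the list above):

```python
# Sanity-check sketch: ask GCP which of the scheduler's required
# permissions the calling service account actually has.
REQUIRED_PERMISSIONS = [
    "compute.zones.list",
    "compute.instances.list",
    "compute.instances.stop",
]

def missing_permissions(crm, project_id, wanted=REQUIRED_PERMISSIONS):
    # testIamPermissions returns only the subset of `wanted` that is granted.
    resp = (
        crm.projects()
        .testIamPermissions(resource=project_id, body={"permissions": list(wanted)})
        .execute()
    )
    granted = set(resp.get("permissions", []))
    return [p for p in wanted if p not in granted]
```

Build `crm` with `googleapiclient.discovery.build("cloudresourcemanager", "v1")` using the function’s service account credentials; an empty return list means the custom role covers everything the function needs.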

Function.zip (two files — main.py and requirements.txt)

  • main.py: There are three functions in this file described below:
  1. Gather_Zones — fetches the current list of zones at run time, since that list changes as cloud providers expand.
  2. Turn_Instance_Off — to make the API call to turn off the instance when it matches the filter.
  3. Instance_Scheduler_Start — to coordinate it all/handle environment variables.
  • requirements.txt: This file lists the two libraries needed for the Python logic to work.
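The three functions above can be sketched roughly like this. A hedged sketch, not the repo’s exact code: the lowercase names, the env-var names (GCP_PROJECT, LABEL_KEY, LABEL_VALUE), and the exact filter string are my assumptions.

```python
import os

def build_filter(label_key, label_value):
    # Compute Engine list filter: only running instances carrying the chosen label.
    return f"(status = RUNNING) AND (labels.{label_key} = {label_value})"

def gather_zones(compute, project):
    # Fetch the zone list at run time, since it grows as GCP expands.
    result = compute.zones().list(project=project).execute()
    return [zone["name"] for zone in result.get("items", [])]

def turn_instance_off(compute, project, zone, name):
    # One stop API call per matching instance.
    compute.instances().stop(project=project, zone=zone, instance=name).execute()

def instance_scheduler_start(event, context):
    # Pub/Sub entry point: the message content is ignored; the trigger is what matters.
    import googleapiclient.discovery  # from google-api-python-client
    compute = googleapiclient.discovery.build("compute", "v1")
    project = os.environ["GCP_PROJECT"]
    flt = build_filter(os.environ["LABEL_KEY"], os.environ["LABEL_VALUE"])
    for zone in gather_zones(compute, project):
        resp = compute.instances().list(project=project, zone=zone, filter=flt).execute()
        for inst in resp.get("items", []):
            turn_instance_off(compute, project, zone, inst["name"])
```

Passing the `compute` client into the helpers (instead of making it a global) keeps them easy to test without touching GCP.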

When I’m done with these two files, I zip them up and make sure the zip filename matches the one referenced in the Cloud Function’s storage archive source (main.tf).
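That zip step can be scripted too. A small sketch — “function.zip” here is a placeholder for whatever archive name your main.tf references; the flat arcname matters because Cloud Functions expects main.py at the root of the archive, not inside a folder:

```python
import zipfile

def build_function_archive(archive="function.zip", files=("main.py", "requirements.txt")):
    # Write each file at the archive root so main.py isn't nested in a directory.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in files:
            zf.write(name, arcname=name)
    return archive
```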

Lastly

Here is the log result after successfully running this:

I had two instances up, one was labeled (based on the variables.tf) and the other wasn’t. It successfully shut down the correct instance. Other than making sure the IAM API is enabled and that the account you are using to provision all the resources has access to do so, that should cover it! Clone the repo, terraform init and apply. Cheers!