On January 4, 2023, CircleCI reported that a malicious actor had breached the company’s internal systems, resulting in the exposure of thousands of customer secrets stored on the CircleCI CI/CD platform. These secrets included cloud access credentials like AWS access keys, environment secrets such as API tokens and database credentials, and more. CircleCI advised its customers to immediately rotate all secrets stored within the platform.
At Gem, some of our customers used CircleCI as their production build tool, and were affected by the breach.
CircleCI Response Uncut: Typical Approach to Key Rotation
Problem 1: We do this by hand?
In the aftermath of the announcement, many security teams immediately went all-hands-on-deck. In a blog post, CircleCI explained how key rotation must take place. Without tooling, much of the response was manual.
The CircleCI platform includes seven types of secrets: OAuth tokens, project API tokens, project environment variables, context variables, user API keys, project SSH keys, and CircleCI runner tokens. Security teams got assistance from vendors in rotating some of these, but project environment variables, context variables, SSH keys, and runner tokens all needed to be rotated manually. For each of these, security teams had to click through each page in CircleCI, extract all keys into a centralized table or Excel file, then rotate the keys on the relevant platform (e.g., AWS, Azure, GCP, or other SaaS or PaaS tools).
Figure 1: The standard approach to the CircleCI breach response requires manual tracking of all individual secrets stored in the platform
Enterprise teams typically stored hundreds or thousands of keys in CircleCI, making this process long and manual. In the hours or days it took to rotate these keys, organizations remained exposed.
Plus, this process wasn’t guaranteed to capture all the exposed keys. If keys had been deleted from the CircleCI platform after December 21st but hadn’t been rotated, those still-active, compromised keys would not show up in the organization’s list of credentials to rotate, leaving companies at risk. Companies might address this by searching for keys in the platforms themselves, for example by searching for the string “circle” in AWS IAM. But even this relied on consistent naming conventions being in place, and some keys could still be missed.
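A minimal sketch of that name-based search, in Python. It works on plain dictionaries shaped like the key metadata AWS IAM returns from ListAccessKeys (UserName, AccessKeyId, Status, CreateDate). The sample records, the function name find_candidate_keys, and the choice of cut-off date are illustrative assumptions, not part of any CircleCI or Gem tooling.

```python
from datetime import datetime, timezone

# Date CircleCI notified customers; keys created before it were not yet rotated.
ANNOUNCED = datetime(2023, 1, 4, tzinfo=timezone.utc)


def find_candidate_keys(access_keys, needle="circle", created_before=ANNOUNCED):
    """Return active keys whose user name contains `needle` (case-insensitive)
    and that were created before `created_before`."""
    needle = needle.lower()
    return [
        key
        for key in access_keys
        if needle in key["UserName"].lower()
        and key["Status"] == "Active"
        and key["CreateDate"] < created_before
    ]


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical inventory, for illustration only.
sample_keys = [
    {"UserName": "circleci-deploy", "AccessKeyId": "AKIAEXAMPLE0001",
     "Status": "Active", "CreateDate": utc(2022, 6, 1)},
    {"UserName": "CircleCI-tests", "AccessKeyId": "AKIAEXAMPLE0002",
     "Status": "Inactive", "CreateDate": utc(2022, 3, 15)},
    # Stored in CircleCI but named without "circle": the search cannot see it.
    {"UserName": "build-bot", "AccessKeyId": "AKIAEXAMPLE0003",
     "Status": "Active", "CreateDate": utc(2022, 9, 9)},
    # Created after the announcement, so treated as already rotated.
    {"UserName": "circleci-deploy", "AccessKeyId": "AKIAEXAMPLE0004",
     "Status": "Active", "CreateDate": utc(2023, 1, 5)},
]

candidate_ids = [key["AccessKeyId"] for key in find_candidate_keys(sample_keys)]
print(candidate_ids)
```

In this sample, the build-bot key never appears in the results even though it is exactly the kind of key that needs rotating, which is the weakness of relying on names.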
Problem 2: So have I been breached or not?
Though CircleCI became aware of the breach and notified customers on January 4, 2023, the initial access occurred on December 21, 2022. Even assuming companies were able to rotate all keys immediately after the announcement, that still left a window of about two weeks where an attacker might have had access to a customer environment.
So how should an organization go about determining whether that access was actually exploited? CircleCI recommended that customers review their system logs for any unauthorized access over that period. But especially in the cloud, it was not obvious where exactly customers should look or what they should be looking for. Organizations needed to know whether to take additional action or send breach notifications, but there was no easy way for them to find out.
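One way to give such a log review a starting point is to restrict it to the exposure window and group activity by credential. The Python sketch below assumes events shaped like AWS CloudTrail records (eventTime, eventName, userIdentity.accessKeyId). The sample events and the helper names are hypothetical, and the window end is set to the end of the disclosure day as a simplifying assumption.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_START = datetime(2022, 12, 21, tzinfo=timezone.utc)         # initial access
WINDOW_END = datetime(2023, 1, 4, 23, 59, 59, tzinfo=timezone.utc)  # disclosure day

window_days = (WINDOW_END.date() - WINDOW_START.date()).days


def parse_event_time(value):
    """Parse a CloudTrail-style UTC timestamp such as 2022-12-22T00:00:04Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def activity_in_window(events, key_ids):
    """Map each key of interest to the sorted API calls it made inside the window."""
    calls = defaultdict(set)
    for event in events:
        key_id = event.get("userIdentity", {}).get("accessKeyId")
        when = parse_event_time(event["eventTime"])
        if key_id in key_ids and WINDOW_START <= when <= WINDOW_END:
            calls[key_id].add(event["eventName"])
    return {key_id: sorted(names) for key_id, names in calls.items()}


# Hypothetical events, for illustration only.
sample_events = [
    {"eventTime": "2022-12-20T00:00:05Z", "eventName": "PutObject",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0001"}},   # before the window
    {"eventTime": "2022-12-22T00:00:04Z", "eventName": "PutObject",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0001"}},
    {"eventTime": "2022-12-28T13:37:00Z", "eventName": "ListUsers",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0001"}},
    {"eventTime": "2022-12-23T09:00:00Z", "eventName": "RunInstances",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0009"}},   # not in the inventory
]

summary = activity_in_window(sample_events, {"AKIAEXAMPLE0001"})
print(window_days, summary)
```

Note that the RunInstances call is dropped because its key is not in the list passed in: a review scoped this way is only as complete as the inventory of keys behind it.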
The Gem Approach
It was clear that a response based on the static secrets inventory in CircleCI left organizations exposed and at risk. At Gem, we took a different approach, using real-time and retroactive activity data to understand an organization’s exposure.
Step 1: See What CircleCI Keys are Used
CircleCI’s disclosures enabled Gem to construct a new TTP: wherever the IP Ranges feature was enabled, we looked for any access to cloud IaaS platforms from CircleCI’s IP ranges. We added this TTP to our detection engine, which automatically searched for it in the environment, both in real time and retroactively. An activity-based approach captures all keys that were actually in use during the exposure window, even if some of them had been deleted from the CircleCI platform, misnamed, or otherwise missed in the manual pass.
Figure 2: Gem's activity-based approach automatically locates all keys associated with CircleCI by checking for access from CircleCI IP ranges
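The core of such a check can be sketched in a few lines of Python. This is not Gem's detection engine. It assumes CloudTrail-style records with sourceIPAddress and userIdentity.accessKeyId fields, and it uses reserved documentation ranges (203.0.113.0/24 and 198.51.100.0/24) as stand-ins for CircleCI's published IP ranges, which in practice would be loaded from CircleCI's own list.

```python
import ipaddress

# Placeholder CIDRs (RFC 5737 documentation ranges), NOT CircleCI's real ranges.
CIRCLECI_RANGES = [
    ipaddress.ip_network(cidr) for cidr in ("203.0.113.0/24", "198.51.100.0/24")
]


def from_circleci(source_ip):
    """True if source_ip parses as an address inside any configured range."""
    try:
        address = ipaddress.ip_address(source_ip)
    except ValueError:
        # The field can hold a service name instead of an IP for service-made calls.
        return False
    return any(address in network for network in CIRCLECI_RANGES)


def keys_used_from_circleci(events):
    """Collect the access key IDs seen in events that came from the ranges."""
    keys = set()
    for event in events:
        key_id = event.get("userIdentity", {}).get("accessKeyId")
        if key_id and from_circleci(event.get("sourceIPAddress", "")):
            keys.add(key_id)
    return keys


# Hypothetical events, for illustration only.
sample_events = [
    # A key whose owner name gives no hint of CircleCI is still caught by activity.
    {"sourceIPAddress": "203.0.113.17",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0003"}},
    {"sourceIPAddress": "198.51.100.8",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0001"}},
    {"sourceIPAddress": "192.0.2.44",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0007"}},   # outside the ranges
    {"sourceIPAddress": "cloudformation.amazonaws.com",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE0008"}},   # service call, no IP
]

found = keys_used_from_circleci(sample_events)
print(sorted(found))
```

Because the match is on where the request came from rather than on what the key is called, it does not depend on naming conventions or on the key still being listed in CircleCI.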
The activity-based approach also extends beyond IaaS access keys, and we can use it to monitor cloud-native PaaS platforms as well. For example, we also added TTPs to detect repeated access from CircleCI IP ranges to RDS instances. This tips us off that the credentials for those RDS instances may be stored in CircleCI.
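As a rough illustration of the "repeated access" idea, the Python sketch below counts accepted network flows to database ports from a CircleCI range. It assumes records with the srcaddr, dstaddr, dstport, and action fields found in VPC Flow Logs. The placeholder CIDR, the port set, the threshold, and the sample flows are all assumptions, and mapping a destination address back to a specific RDS instance is left out.

```python
import ipaddress
from collections import Counter

# Placeholder CIDR (RFC 5737 documentation range), NOT a real CircleCI range.
CIRCLECI_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
DB_PORTS = {3306, 5432}  # default MySQL and PostgreSQL ports


def databases_reached_from_circleci(flows, min_connections=3):
    """Return {destination address: count} for database endpoints that accepted
    at least `min_connections` flows from the configured ranges."""
    hits = Counter()
    for flow in flows:
        source = ipaddress.ip_address(flow["srcaddr"])
        if (
            flow["action"] == "ACCEPT"
            and flow["dstport"] in DB_PORTS
            and any(source in network for network in CIRCLECI_RANGES)
        ):
            hits[flow["dstaddr"]] += 1
    return {dst: count for dst, count in hits.items() if count >= min_connections}


def flow(src, dst, port, action="ACCEPT"):
    return {"srcaddr": src, "dstaddr": dst, "dstport": port, "action": action}


# Hypothetical flows, for illustration only.
sample_flows = [
    flow("203.0.113.5", "10.0.1.20", 5432),
    flow("203.0.113.5", "10.0.1.20", 5432),
    flow("203.0.113.9", "10.0.1.20", 5432),
    flow("203.0.113.5", "10.0.1.30", 3306),            # only one connection
    flow("192.0.2.9", "10.0.1.20", 5432),              # not from the ranges
    flow("203.0.113.5", "10.0.1.40", 5432, "REJECT"),  # never connected
]

suspected = databases_reached_from_circleci(sample_flows)
print(suspected)
```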
Step 2: Assess Impact
Using activity logs also enables us to answer the question on everyone’s mind: am I breached, or not? Once we’ve gathered a list of potentially compromised keys, we employ behavioral analytics to understand how they were used in the environment and determine whether they were used maliciously.
Imagine a given credential in CircleCI is used for daily builds at midnight UTC. If we determine that the credential was compromised, but its only activity in the environment was the same daily build at the same time, it’s extremely unlikely that the key was actually used maliciously. On the flip side, if the key was suddenly used to create virtual machines in a new region or to access sensitive data from new IP ranges or countries, the opposite is likely true and the security team needs to be notified immediately. In between these two examples are shades of gray, but an activity-based approach allows for a systematic classification of risk.
Figure 3: Once keys are located, Gem conducts an automatic retroactive threat-hunt to detect any signs that the keys were used maliciously and that the organization may be breached
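The midnight-build example can be turned into a toy classifier. The Python sketch below is a deliberately simple stand-in for behavioral analytics, not Gem's method: it assumes CloudTrail-style events (eventTime, eventName, awsRegion), builds a baseline from a key's activity before the exposure window, and flags any event in the window whose action, region, or hour of day was never seen before. The function names and sample events are hypothetical.

```python
from datetime import datetime, timezone


def parse_event_time(value):
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def fingerprint(event):
    """Reduce an event to the traits compared against the baseline."""
    hour = parse_event_time(event["eventTime"]).hour
    return (event["eventName"], event["awsRegion"], hour)


def classify_key(baseline_events, window_events):
    """Return ("consistent", []) if every event in the window matches earlier
    behavior, otherwise ("review", [the deviating events])."""
    known = {fingerprint(event) for event in baseline_events}
    deviations = [e for e in window_events if fingerprint(e) not in known]
    return ("review", deviations) if deviations else ("consistent", [])


def event(time, name, region):
    return {"eventTime": time, "eventName": name, "awsRegion": region}


# Hypothetical history of one key: a build that uploads artifacts at midnight UTC.
baseline = [
    event("2022-12-18T00:00:03Z", "PutObject", "us-east-1"),
    event("2022-12-19T00:00:05Z", "PutObject", "us-east-1"),
    event("2022-12-20T00:00:02Z", "PutObject", "us-east-1"),
]
quiet_window = [
    event("2022-12-22T00:00:04Z", "PutObject", "us-east-1"),
    event("2022-12-23T00:00:06Z", "PutObject", "us-east-1"),
]
noisy_window = quiet_window + [
    event("2022-12-27T14:02:11Z", "RunInstances", "ap-southeast-1"),
]

quiet_verdict, _ = classify_key(baseline, quiet_window)
noisy_verdict, noisy_deviations = classify_key(baseline, noisy_window)
print(quiet_verdict, noisy_verdict, [e["eventName"] for e in noisy_deviations])
```

An exact match on hour is brittle (a build that runs a few minutes late across an hour boundary would be flagged), and the two-way verdict ignores the shades of gray mentioned above. A practical system would score deviations rather than split them into two buckets.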
A systematic, activity-based approach to the CircleCI breach has three main benefits for security teams:
Response efficiency is increased. Security teams can know which keys are compromised immediately, rather than manually scraping data from CircleCI or running numerous queries in a SIEM.
Overall risk is reduced. Using activity data captures all keys in use, even if naming conventions are not adhered to or if active keys have been deleted from the CircleCI platform. Organizations can be certain they’ve captured everything.
Breach status can be determined immediately. Behavioral analytics enables organizations to determine if they were breached or not, and respond accordingly.
Figure 4: Gem enables effortless indexing and search for all credentials affected by the CircleCI breach
Our unique approach at Gem enabled our customers to find the compromised credentials and act. Gem automatically sifts through the logs so our customers don’t have to, and efficiently locates compromised keys to reduce risk and prevent future breaches.