The State of Secrets Sprawl

3.4K
Occurrences of secrets detected
per AppSec engineer in 2021

The growing problem of secrets sprawling in corporate repositories can only be solved by enabling collaboration between AppSec and Developers.

It’s safe to say that 2021 will go down in history for cybersecurity experts.

Ransomware and other large-scale cyberattacks (SolarWinds, Colonial Pipeline) and vulnerabilities (Log4Shell) made headlines around the world. The number of software supply chain attacks has exploded, which comes as no surprise considering the plethora of vulnerabilities and misconfigurations found across software development environments.

Unsurprisingly, a lot of attacks start with the compromise of a leaked secret. Credentials are a nightmare for security engineers because they can end up in so many places: build, monitoring, or runtime logs, stack traces, and … git history.

The results of our 2022 Secrets Sprawl report reinforced our view that the only way to address the challenge of secrets sprawling within corporate repositories is to enable shared responsibility between AppSec and Devs.

Public Monitoring

Internal Monitoring

Source code is a huge wealth of knowledge. It also happens to exist on pretty much every developer’s workstation, which they probably take home with them. You probably don’t want your secrets being all over the country.

Don, Security engineer

Public monitoring on GitHub

56M

users

+25%

repositories created last year

+23%

commits scanned
by GitGuardian

How leaky was 2021?

Over 6M secrets detected in 2021

2x increase compared to 2020

On average, 3 commits out of 1,000 exposed at least one secret, +50% compared to 2020

Leaks correlate to popularity

It should come as no surprise that leaks are proportional to user adoption, and this is especially true for newcomers rapidly gaining in popularity.

Supabase, which has consistently ranked among GitHub’s top-20 fastest-growing open-source startups and launched 50,000 databases in 2021 alone, is a telling example. Another is PlanetScale, a serverless database platform released in Q4 2021, which immediately started appearing on our radar.

Scanning Docker Hub

When it comes to open source, GitHub is certainly the first platform that comes to mind. Yet it is not the only resource for code sharing. Since Docker popularized the use of containers to package apps, its official public registry, Docker Hub, has become another developer favorite.

It’s therefore not surprising to find secrets in Docker Hub.

The layers making up Docker images are additional attack surfaces that can too easily be left out of the security perimeter. For attackers, they are yet another chance to find an access vector, as demonstrated by the Codecov breach.

Docker Hub = 8.8M Docker images publicly available

This motivated us to conduct our first study on the extent of secrets sprawl in Docker Hub a few months ago. To refine this first estimate, we repeated the exercise, this time with a 5x larger sample. Here are our results:

We found more than 500 commit messages containing GitHub personal access tokens!

Where leaks come from

01 India
02 USA
03 Germany
04 France
05 Indonesia
06 Russia
07 Nigeria
08 Bangladesh
09 Brazil
10 UK

Internal monitoring

In 2020, GitGuardian launched Internal Repositories Monitoring for Enterprise. 

By monitoring thousands of repositories in real time and scanning the entire history of corporate codebases, we gained a realistic view of the state of application security in the DevOps era.

If there is a single conclusion to be drawn from the data, it is that the amount of work required for both remediating real-time incidents and investigating leaks detected in the git history (which can still represent a threat) far exceeds current AppSec teams' capabilities.

Secrets detection is a very essential part of security. It’s one of the basics that you need to cover all the time. Otherwise, you’re going to expose your endpoints online and you’re going to suffer endless attacks. When it comes to application development, secrets detection is essential to a security program. You need to have it. Otherwise, you’ll fail.

Abbas Haidar, Head of InfoSec

Security teams are overwhelmed

On average, in 2021, a typical company with 400 developers and 4 AppSec engineers would discover 1,050 unique leaked secrets upon scanning its repositories and commits.

With each secret detected in 13 different places on average, the amount of work required for remediation far exceeds current AppSec teams' capabilities (1 AppSec engineer for 100 developers).

1 AppSec engineer needs to handle 3,413 secret occurrences on average
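
The per-engineer workload follows from simple arithmetic on the averages quoted above: unique secrets, multiplied by the number of places each one appears, divided by the number of AppSec engineers. The Python sketch below reproduces that calculation. The function name is hypothetical, and the published figure of 3,413 was presumably computed from unrounded inputs, so the rounded averages only approximate it.

```python
import math


def occurrences_per_engineer(unique_secrets: int,
                             places_per_secret: float,
                             engineers: int) -> float:
    """Average number of secret occurrences each AppSec engineer must handle."""
    if engineers <= 0:
        raise ValueError("engineers must be a positive number")
    return unique_secrets * places_per_secret / engineers


# Rounded averages quoted in this section: 1,050 unique secrets,
# 13 places per secret, 4 AppSec engineers for 400 developers.
load = occurrences_per_engineer(1050, 13, 4)
print(load)             # 3412.5
print(math.ceil(load))  # 3413
```

Changing the inputs shows how quickly the load grows: doubling the number of places where each secret appears doubles the work per engineer, while the team size stays the same.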

A false sense of secrecy

Our intuition that private repositories foster a false sense of secrecy, causing even more leaks to occur than in public ones, was confirmed:

On average, private repositories are 4x more likely to reveal at least one incident.

Not only are private repositories more likely to be affected, but they also reveal the real magnitude of secrets sprawl.

I think people are getting more aware of secrets. [...] I think that it has had a positive impact on the culture itself. You’re only as good as the software you write, and you’re in for a world of hurt if you put the keys to the castle inside of that source code that could be somehow reverse-engineered. By separating the two, the source code and the keys, you’re one step ahead of that. I think it’s essential.

Blake, DevSecOps engineer

Developer in the Loop

A core feature introduced in 2021, Developer in the Loop allows security engineers to share an incident with the developer who committed the secret. We are firmly convinced that a Shared Responsibility Model is key to enabling application security at scale (security teams own the process, but developers are involved), and just a few months after the release, the results already speak for themselves.

Involving the developer results in an incident closing rate 72% higher and a Median Time to Remediate divided by 2

Solving the problem of secrets sprawl

Incident detection and remediation can be shifted left at various levels to build a layered defense across the whole development cycle.

Here is a progressive approach to move toward a “zero secrets-in-code” policy:

Start by monitoring commits and merge/pull requests in real-time for all your repositories with native VCS or CI integration, where the ultimate threat lies (shift at team level).

Progressively enable pre-receive checks to harden central repositories against leaks, and “stop the bleeding”.

In the meantime, educate developers about using pre-commit scanning as a seatbelt (shift at developer level).

Plan a longer-term strategy to handle older incidents discovered by scanning the git history.

Implement a Secrets Security Champion program.
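
To make the pre-commit "seatbelt" step more concrete, here is a minimal Python sketch of a pattern-based check that could run on text before it is committed. It is an illustration under stated assumptions, not GitGuardian's detection engine: the `DETECTORS` table and the `find_secrets` function are hypothetical names, and the two regular expressions only cover the commonly documented formats of classic GitHub personal access tokens (`ghp_` followed by 36 alphanumeric characters) and AWS access key IDs (`AKIA` followed by 16 uppercase alphanumeric characters).

```python
import re
from typing import List, Tuple

# Illustrative detectors only (assumed formats, far from exhaustive).
DETECTORS = {
    "github_personal_access_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(text: str) -> List[Tuple[int, str]]:
    """Return (line number, detector name) for each line matching a detector."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in DETECTORS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings


# Usage with obviously fake credentials built at runtime.
sample = (
    "import os\n"
    'TOKEN = "ghp_' + "a" * 36 + '"\n'
    'KEY_ID = "AKIA' + "A" * 16 + '"\n'
)
for line_no, name in find_secrets(sample):
    print(f"line {line_no}: possible {name}")
```

A hook built on such a function would typically pass it the staged changes and exit with a non-zero status when the list of findings is not empty, which blocks the commit. Simple patterns like these miss many secret types and can produce false positives, which is one reason to keep server-side checks and real-time monitoring in place as well.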

By integrating vulnerability scanning into the development workflow, security isn't a bottleneck anymore. You can help developers catch vulnerabilities at the earliest stage and considerably limit remediation costs. This is even more true for secrets detection, which is very sensitive to sprawling (as soon as a secret enters a version control system, it should be considered compromised and so requires remediation effort). On the other hand, you can reduce the number of secrets entering your VCS by better educating developers while preserving their workflow.

Let's conclude

Secrets sprawl is a growing phenomenon, not only because more code is pushed, forked, and shared online every day, but also because the number of building blocks making up an application is increasing (cloud infrastructure, managed databases, SaaS applications, open-source components, internal microservices…).

As the sprawl accelerates, version control systems are quickly becoming a top target for hackers looking to start a supply chain attack, as seen in multiple breaches last year. Compromising hardcoded secrets requires no special skills, and the proliferation of leaked secrets on public GitHub (which has more than doubled since 2020) is a red flag for many application security professionals. Public repositories must therefore be included in the safeguarded corporate perimeter.

The same is true on the internal side. Private repositories hide a huge number of (often forgotten) secrets that could one day be used for fraudulent purposes. Unfortunately, even when they are aware of it, AppSec teams are overwhelmed by the amount of work to be done, either to remediate incidents on the fly or to dig through the stack of older but still present ones.

There is an urgency to remove secrets from source code, but to do so requires adopting the right mindset: our experience has shown that the only reasonable approach to deliver secure software at scale is to share the application security responsibility between developers, security, and ops. Enabling this model is our mission.

If you want to learn more about how GitGuardian solutions can help you improve on code security, don’t hesitate to contact us.

About GitGuardian

The new ways of building software make it necessary to address new vulnerabilities and support new remediation workflows. These needs have emerged so abruptly that they have given rise to a young and highly fragmented DevSecOps tooling market. Solutions are specialized based on the type of vulnerabilities being addressed: SAST, DAST, IAST, RASP, SCA, Secrets Detection, Container Security, and Infrastructure as Code Security. However, these tools are not well integrated into developers’ workflows.

GitGuardian, founded in 2017 by Jérémy Thomas and Eric Fourrier, has emerged as the leader in secrets detection and is now focused on providing a holistic code security platform while enabling the Shared Responsibility Model of AppSec. The company has raised $56M in total investment to date.

With more than 150K installs, GitGuardian is the #1 security application on the GitHub Marketplace.

Its enterprise-grade features enable AppSec and Development teams to collaborate on delivering secret-free code. Its detection engine is based on %ndet% detectors able to catch secrets in both public and private repositories and containers at every step of the CI/CD pipeline.