Apr 25, 2024

Avoiding Secret Leaks in Python Projects

By Andreas Bergman

In 2022, approximately 10 million secrets were reportedly exposed in GitHub commits. Exposure like this can have serious consequences, such as data breaches and financial losses. So how does it happen, and more importantly, what can you do to reduce the risk of leaking your own secrets?

Preventing leakage

Secrets such as passwords, API keys, and encryption tokens can leak in many ways: hard-coding them in source code repositories, insufficient access controls on storage systems, insecure transmission over networks, and logging that captures sensitive information. Here are some ways to avoid leaking secrets.

Use `.gitignore` and avoid `git add *`

To avoid accidentally committing sensitive files, stage files explicitly with `git add [file]` instead of `git add *`. Use a `.gitignore` file to automatically exclude files and directories that should never be committed. GitHub maintains a collection of default `.gitignore` templates (the `github/gitignore` repository), which are a solid starting point.
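As a complement to `.gitignore`, a small check can flag risky paths before they are staged. The sketch below is only an illustration under stated assumptions: the function name `find_sensitive_paths` and the pattern list are hypothetical and should be adapted to your project.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns for files that commonly hold secrets; adjust per project.
SENSITIVE_PATTERNS = [".env", ".env.*", "*.pem", "*.key", "secrets.*"]


def find_sensitive_paths(paths, patterns=SENSITIVE_PATTERNS):
    """Return the paths whose file name matches any sensitive pattern."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name  # match on the file name only
        if any(fnmatch(name, pattern) for pattern in patterns):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    candidates = ["src/app.py", "config/.env", "certs/server.pem", "README.md"]
    for path in find_sensitive_paths(candidates):
        print(f"Refusing to stage: {path}")
```

A check like this does not replace `.gitignore`; it simply gives a second chance to notice a file that slipped through.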

Redact secrets

To keep sensitive information out of log output, use redacting tools. Pydantic’s `SecretStr` and `SecretBytes` are effective because they mask a variable’s content unless you explicitly ask for it. This helps you avoid unintentionally writing confidential information to your system logs.
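To make the idea concrete, here is a stdlib-only sketch of a masking wrapper. It is not Pydantic's implementation: the `Secret` class is hypothetical, and only the method name `get_secret_value` is borrowed from Pydantic's `SecretStr` so the two feel similar.

```python
import logging


class Secret:
    """Hypothetical wrapper that hides its value when printed or logged."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The only way to read the real value; callers must opt in explicitly.
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"

    def __str__(self) -> str:
        return "**********"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    api_key = Secret("not-a-real-key")
    logging.info("Loaded key: %s", api_key)  # the log line contains the mask only
    print(len(api_key.get_secret_value()))   # explicit reveal, used deliberately
```

Because both `__str__` and `__repr__` return a mask, the value stays hidden in f-strings, log messages, and debugger output unless the code asks for it by name.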

Implement credential scanning tools

Incorporate tools such as TruffleHog and Gitleaks into your CI/CD pipelines. These tools scan repositories and commit histories for exposed secrets, so you can catch leaks early and remediate them before they escalate.
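The core idea behind pattern-based scanning can be sketched in a few lines. This is a toy illustration, not how TruffleHog or Gitleaks are implemented: the rule names and the `scan_text` function are assumptions, and real scanners ship far larger rule sets plus additional checks.

```python
import re

# Illustrative rules only; real scanners use many more, carefully tuned patterns.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    sample = 'region = "eu-west-1"\nkey = "AKIAIOSFODNN7EXAMPLE"\n'
    for line_number, rule_name in scan_text(sample):
        print(f"line {line_number}: possible {rule_name}")
```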

Note

Please be aware that while you can scan your own repositories for sensitive information, attackers also use scrapers, spiders, and trackers to find leaked secrets. Some of these tools detect a leak shortly after it reaches a public repository. This means that even if you delete the file in a later commit, the secret is still exposed in the commit history unless the offending commit itself is rewritten. Treat any secret that has reached a public repository as compromised and rotate it.

Secure sharing of secrets

Never transmit secrets through plaintext channels like email or chat applications. Remember that those services can be breached too. Many leaks come through back channels that are not directly linked to your project or to your own machine. Use encrypted methods or password managers like Bitwarden for secure sharing.

Avoid hardcoding secrets

Never hardcode secrets into your code. Instead, use environment variables or the `keyring` package. If you are hosting online, AWS, GCP, and Bitwarden each provide a secrets manager.
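A minimal sketch of the environment-variable approach, using only the standard library. The helper name `get_required_secret` and the variable name `API_TOKEN` are assumptions made for this example.

```python
import os


def get_required_secret(name, environ=None):
    """Read a secret from the environment, failing loudly if it is missing."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # No hardcoded fallback: a missing secret should stop the program.
        raise RuntimeError(f"Required environment variable {name!r} is not set")
    return value


if __name__ == "__main__":
    # A stand-in mapping for the demo; in real use, omit `environ` to read os.environ.
    demo_env = {"API_TOKEN": "example-value-for-demo"}
    token = get_required_secret("API_TOKEN", demo_env)
    print(f"Token loaded ({len(token)} characters)")
```

Failing at startup when a secret is missing is a deliberate choice here: it removes the temptation to add a default value in code, which is just hardcoding by another name.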

Adhere to the least privilege principle

Implementing strict boundaries around sensitive information can prevent leaks and mitigate the damage if a leak occurs. Access tokens and keys should have narrowly defined scopes, with limits on their budgets and lifetimes. Grant them only to the people who need them to perform their duties. This not only enhances security but also simplifies auditing and compliance by making it easier to track who has access to what and for how long.
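One way to picture narrow scopes and short lifetimes is as a record that is checked on every use. The sketch below is purely illustrative: `ScopedToken`, its fields, and the scope strings are hypothetical and do not correspond to any particular provider's token format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ScopedToken:
    """Hypothetical token record: who holds it, what it may do, and until when."""

    owner: str
    scopes: frozenset
    expires_at: datetime

    def allows(self, scope, now=None):
        """Return True only if the scope was granted and the token has not expired."""
        if now is None:
            now = datetime.now(timezone.utc)
        return scope in self.scopes and now < self.expires_at


if __name__ == "__main__":
    issued = datetime.now(timezone.utc)
    token = ScopedToken(
        owner="ci-deploy",
        scopes=frozenset({"repo:read"}),
        expires_at=issued + timedelta(hours=1),  # short-lived by design
    )
    print(token.allows("repo:read"))   # granted and still valid
    print(token.allows("repo:write"))  # never granted
```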

Regularly audit code and processes

Make sure that security measures, code, and repositories are audited regularly. This includes scanning for leaks and checking project dependencies and CI/CD pipelines for vulnerabilities. Tools such as Bandit and Safety CLI can help with this.

Managing credentials

Use unique passwords

It’s important to use a different password for each account, because the leak of one password can compromise your other accounts through “credential stuffing”. When a user reuses the same username and password across multiple accounts, attackers who obtain those credentials try them on many other services to see which ones they can access. The attack is common precisely because users recycle their credentials so often.

Note

Password managers make this easy: they generate long, hard-to-guess passwords for each account and keep them behind a single memorable master password.
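If you ever need to generate such a password yourself in Python, the standard library's `secrets` module is designed for this. The sketch below is a minimal example; the function name, the default length of 24, and the 12-character minimum are assumptions chosen for illustration, not an official recommendation.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=24):
    """Return a random password drawn from letters, digits, and punctuation."""
    if length < 12:
        # Assumed minimum for this sketch; pick a policy that suits your needs.
        raise ValueError("Use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Note the use of `secrets` rather than `random`: the `random` module is not intended for security-sensitive values.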

Use multi-factor authentication

Secure your code repositories and cloud environments with multi-factor authentication to safeguard against unauthorized access. Passkeys and authenticator applications are great ways to keep your accounts secure. Be wary of weaker second factors such as SMS codes, which can be intercepted or hijacked.

Final thoughts

Many data leaks are not caused by technical issues such as unencrypted messages or poorly structured logs. Instead, a significant number result from social engineering and credential attacks, such as phishing and credential stuffing. With the rise of AI and deepfakes, it is also important to be cautious of messages and videos that appear to come from colleagues.

Therefore, it is essential to approach security measures with a skeptical mindset and take necessary precautions. These steps may include verifying requests through a secondary channel, promptly reporting any suspicious activities, and establishing clear communication guidelines regarding data sharing on different platforms and with specific individuals. Additionally, it is crucial to implement robust policies for creating and managing tokens, credentials, and access keys.

Keeping secrets can be tough, especially in a world that tries so hard to reveal them, but by following these guidelines, you can help ensure your secrets stay a little safer.

Have fun, stay skeptical, and be safe!
