Vulnerability management is a core aspect of security compliance. In the past few years CircleCI has gone through FedRAMP certification and SOC 2 compliance, both of which paired us with auditors who needed the details of our vulnerability management processes.
While vulnerability management is a foundational security practice that is absolutely essential, it is also generally tedious and repetitive. This makes it a prime candidate for automation and optimization. Building processes stringent enough to satisfy auditors without accruing massive overhead for our team and our engineering organization posed an interesting challenge.
In this post, I’ll walk you through the vulnerability management process we developed, built to satisfy auditors and align with our team’s use of CI/CD, Docker, and Kubernetes.
Installing a vulnerability scanner doesn’t secure anything on its own. As with all security tools, to get any benefit, the tools need to be applied to our systems.
When we started this process, we knew our plan would look something like this:
- Get an idea of how bad things actually were and get some high priority patches in-flight.
- Revisit the tooling and build automation around ticket generation, and then around reporting.
- Find a way to help teams optimize and tune their patching processes.
We ran .csv exports into spreadsheets, then added formulas for summaries and some simple Apps Script code to do a bit of data cleanup and reconciliation. We wanted to understand which images and containers were currently running in our live system, what vulnerabilities they had, and what patching cadence existed: just what was already there, before making any tickets or talking to any development teams. Which services had official owners, and which would we have to go find owners for? What was the shape of the data from the tooling? It was important for us to look at this data manually in a spreadsheet first, so that when we automated it, we’d understand the data coming out of our scanner.
This step provided a lot of details that gave us an initial view into some of the inconsistencies and complexities in the data. Another important set of information was the number of images and containers that didn’t have any official team-owners listed.
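For readers who would rather script this first pass than build spreadsheet formulas, here is a minimal sketch of the same kind of summary. The column names (`image`, `severity`, `cve`, `owner`) and the sample rows are hypothetical placeholders, not the format of any particular scanner’s export.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical export layout and placeholder rows; a real scanner's
# column names and identifiers will differ.
SAMPLE_EXPORT = """image,severity,cve,owner
api-gateway:1.4,critical,CVE-0000-0001,platform
api-gateway:1.4,high,CVE-0000-0002,platform
billing-worker:2.0,medium,CVE-0000-0003,
billing-worker:2.0,critical,CVE-0000-0004,
"""


def summarize(export_text):
    """Count findings per image by severity and collect images with no owner."""
    by_image = defaultdict(Counter)
    unowned = set()
    for row in csv.DictReader(io.StringIO(export_text)):
        by_image[row["image"]][row["severity"].strip().lower()] += 1
        if not row["owner"].strip():
            unowned.add(row["image"])
    return by_image, sorted(unowned)


by_image, unowned = summarize(SAMPLE_EXPORT)
for image, counts in sorted(by_image.items()):
    print(image, dict(counts))
print("images without an owner:", unowned)
```

Even a summary this small surfaces the two things that mattered most at this stage: where the critical findings are concentrated, and which images still need an owner.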
Then it was time to assign tickets, focusing on images with critical vulnerabilities first. We generated the first round of tickets by hand and assigned them to teams. Because some of the images didn’t have owners listed, we made a few educated guesses and then watched if those tickets were moved to other teams.
Another set of conversations that happened at this stage was with our auditors. After rolling out the first draft of our policy, we realized our patching timeframes were far too aggressive (it turns out the auditors had thought so too, but didn’t push back). Development teams couldn’t keep up. We realized it was better to have a reasonable policy and hit it than to be too aggressive and constantly make exceptions. We worked with our auditors to update the policy to a realistic level. This was an important lesson for us about the maturity of our processes. We could tighten up timeframes later, once we had repeatable processes.
Divide and automate the workload with CI/CD
In traditional server-based infrastructure, the testing environment is separate from the production environment, so each requires independent patching, and discovering whether a particular patch causes problems takes much more manual, task-based overhead.
Because Docker images contain their software dependencies in a clearly defined unit of deployment, it becomes straightforward to scan those in CI/CD pipelines. With automated testing that includes the software dependencies, patches can be quickly tried and validated using existing tests.
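By integrating it into CI/CD pipelines, vulnerability patching stops being a big event that we do every month. It’s just baked into the way we ship.
For those planning to adopt this approach on your own team, know that it will cause pain at first. Even if your team is good at CI/CD, you are adding steps to the build pipeline that will fail builds that would otherwise have passed, and people will be frustrated. You’ll have to work with them to get back to green. That’s okay, but be prepared for it. Any systems that have gone unpatched for a while will take significantly more time to clean up, so make sure to coordinate with teams and management while you do the initial patching. Roll out to groups of services, and be ready to grant temporary exceptions or disable enforcement for specific services for short periods while that initial patching is happening.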
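To make the idea of a build gate with temporary exceptions concrete, here is a minimal sketch. The blocking severities, the exemption list, the service names, and the shape of `findings` are all assumptions for illustration; a real pipeline would read whatever its scanner actually emits.

```python
# Severities that fail the build; a hypothetical policy choice.
BLOCKING = {"critical", "high"}

# Hypothetical temporary exceptions: service name -> reason.
EXEMPT_SERVICES = {"legacy-reports": "initial patching in progress"}


def gate(service, findings, exempt=EXEMPT_SERVICES, blocking=BLOCKING):
    """Return (exit_code, message) for one service's scan findings.

    `findings` is a list of dicts with "id" and "severity" keys, a stand-in
    for real scanner output.
    """
    blockers = [f for f in findings if f["severity"].lower() in blocking]
    if not blockers:
        return 0, f"{service}: no blocking vulnerabilities"
    ids = ", ".join(f["id"] for f in blockers)
    if service in exempt:
        # Report the findings but let the build pass while patching catches up.
        return 0, f"{service}: {len(blockers)} blocking ({ids}), exempt: {exempt[service]}"
    return 1, f"{service}: {len(blockers)} blocking ({ids})"


demo = [
    {"id": "VULN-1", "severity": "Critical"},
    {"id": "VULN-2", "severity": "low"},
]
print(gate("api-gateway", demo))
print(gate("legacy-reports", demo))
```

A CI step would pass the returned code to its exit status. Keeping the exemption list in version control means every exception is visible and reviewable, and removing one is a single-line change once the initial patching is done.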
As game-changing as CI/CD integration is, alone it is not sufficient for vulnerability management. It does a great job of catching vulnerabilities caused by new deploys, but one of the challenges with vulnerabilities is they’re discovered all the time in existing software. A vulnerability could surface today that affects the code that passed all tests and went into production yesterday. CI/CD is only one half of the puzzle; production environment scanning is the other half.
Put another way, for efficiency in interfacing with teams, integrating with CI/CD can’t be beat. But for seeing the full picture of your vulnerabilities, production scanning is required.
… but not for long. Spreadsheets are great for getting an overview of your data, but if you want to start generating tickets and reporting on how your vulnerability management is going, you are going to need more analysis than fits comfortably in a spreadsheet. So we built an integration pipeline that automated the biggest pieces of manual work for the security team. This meant taking data out of the production vulnerability scanner and putting it into tickets. As I emphasized before, it’s a really good idea for security teams to meet engineering teams in the tools they already use daily, so getting our program to turn data into tickets was essential.
The integration sounds super simple at first glance: Take data out of your prod vulnerability scanner, transform it, and make tickets. Famous last words.
There were a number of challenges to keep in mind with this integration:
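- We needed to update existing tickets so that open tickets were not duplicated. We added an MGMT-ID in JIRA to track the image/ticket identity mapping.
- Both the source and destination APIs changed over time, quite significantly. To handle this, we built abstraction layers around the APIs we integrate with in JIRA and the scanning tool.
- The service had to normalize the odd data from all the different distributions, libraries, and tools into something consistent. This ran from normalizing ‘moderate’ vs ‘medium’ (easy), to every vendor having a different format for “fixed in version” (harder).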
This took quite a bit of work, but resulted in a simple command line application that could read the latest information, generate detailed tickets, and assign them to teams in a matter of minutes.
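The challenges above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual application: `InMemoryTracker` is a stand-in for a ticket-tracker client rather than a real JIRA API, the `image:<name>` ID scheme is hypothetical, and only the ‘moderate’ vs ‘medium’ alias comes from this post.

```python
from dataclasses import dataclass, field

# Map vendor-specific severity labels onto one vocabulary. "moderate" vs
# "medium" comes from the post; the "important" entry is illustrative.
SEVERITY_ALIASES = {"moderate": "medium", "important": "high"}


def normalize_severity(raw):
    """Lowercase and trim a severity label, then apply known aliases."""
    label = raw.strip().lower()
    return SEVERITY_ALIASES.get(label, label)


@dataclass
class InMemoryTracker:
    """Stand-in for a ticket tracker client; not a real JIRA API."""
    tickets: dict = field(default_factory=dict)  # mgmt_id -> ticket body

    def find_open(self, mgmt_id):
        return self.tickets.get(mgmt_id)

    def create(self, mgmt_id, body):
        self.tickets[mgmt_id] = body

    def update(self, mgmt_id, body):
        self.tickets[mgmt_id] = body


def sync_image(tracker, image, findings):
    """Create a ticket for an image, or update the open one; never duplicate."""
    mgmt_id = f"image:{image}"  # hypothetical ID scheme
    body = {
        "image": image,
        "findings": sorted(
            (f["id"], normalize_severity(f["severity"])) for f in findings
        ),
    }
    if tracker.find_open(mgmt_id) is None:
        tracker.create(mgmt_id, body)
        return "created"
    tracker.update(mgmt_id, body)
    return "updated"


tracker = InMemoryTracker()
first = sync_image(tracker, "billing-worker", [{"id": "VULN-1", "severity": "Moderate"}])
second = sync_image(
    tracker,
    "billing-worker",
    [{"id": "VULN-1", "severity": "medium"}, {"id": "VULN-2", "severity": "HIGH"}],
)
print(first, second, len(tracker.tickets))
```

Because `sync_image` only talks to the three tracker methods, swapping the in-memory stand-in for a real client (or absorbing an upstream API change) stays confined to one class.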
Reporting on production
Since vulnerability management for Docker differs from traditional patching, we had to work with our auditors so they could understand how we were assigning work and make use of the reports we were generating. Traditionally, sysadmin teams would be the ones doing patching, and that’s what our auditors were used to seeing. It took some work to explain that, in our case, every engineering team across the org would be doing patching. After some back and forth, we added some additional details to our report and were able to design one report format that met the needs of our FedRAMP auditors as well as our internal stakeholders.
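As a rough illustration of a per-team view like the one described here, the sketch below counts open and overdue findings for each owning team. The patching windows, team names, and record fields are hypothetical; this post does not state the actual policy values or report layout.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical patching windows in days; the real policy values are not
# given in this post.
PATCH_WINDOW_DAYS = {"critical": 15, "high": 30, "medium": 90}


def team_report(findings, today):
    """Count open and overdue findings per owning team."""
    report = defaultdict(lambda: {"open": 0, "overdue": 0})
    for f in findings:
        due = f["detected"] + timedelta(days=PATCH_WINDOW_DAYS[f["severity"]])
        row = report[f["team"]]
        row["open"] += 1
        if today > due:
            row["overdue"] += 1
    return dict(report)


sample = [
    {"team": "payments", "severity": "critical", "detected": date(2021, 1, 1)},
    {"team": "payments", "severity": "medium", "detected": date(2021, 2, 1)},
    {"team": "platform", "severity": "high", "detected": date(2021, 2, 15)},
]
report = team_report(sample, today=date(2021, 3, 1))
for team, row in sorted(report.items()):
    print(team, row)
```

Grouping by team rather than by host reflects the point made to the auditors: the patching work is spread across every engineering team, so the report has to show each team’s standing against the policy.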
- Soft-version specification is done by specifying only a major version, or a major + minor version, of each package and letting the package manager automatically update to the latest matching version as part of a scheduled Docker image build. If that scheduled build runs weekly, it removes much of the human intervention from ongoing patching. There are some operational risks associated with this kind of automated version bumping, so if you adopt this level of automation, it is important to have thorough integration tests and to try it out on less critical services first.
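To show what a soft pin resolves to, here is a small sketch of the matching rule. It is a simplified model that assumes purely numeric dotted versions; real package managers have their own specifier syntax and ordering rules, and the version list is made up for the example.

```python
def parse(version):
    """Turn '1.4.12' into (1, 4, 12); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def resolve_soft_pin(pin, available):
    """Return the newest available version that starts with the pinned prefix."""
    prefix = parse(pin)
    matches = [v for v in available if parse(v)[:len(prefix)] == prefix]
    return max(matches, key=parse) if matches else None


# Made-up versions of a hypothetical package.
available = ["1.3.9", "1.4.0", "1.4.12", "1.4.2", "2.0.1"]
print(resolve_soft_pin("1.4", available))  # major + minor pin
print(resolve_soft_pin("1", available))    # major-only pin
```

Comparing parsed tuples rather than strings matters here: as text, "1.4.2" sorts after "1.4.12", which would pick the older release.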
Tips for vulnerability management
(Or, how to work with auditors, Docker, and CI/CD systems)
- Engage with engineering teams using the tools and processes they already have in place.
- Don’t be too dogmatic at first.
- Build patching into your existing CI/CD.
- Invest in automation of vulnerability management.