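Vulnerabilities are constantly being discovered in software packages, libraries, operating systems, and infrastructure. Vulnerability management is the ongoing process of scanning, classifying, prioritizing, and patching software vulnerabilities. All modern technology stacks now need this kind of periodic maintenance and updating in order to stay stable and secure.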

This is a core aspect of security compliance. In the past few years CircleCI has gone through FedRAMP certification and SOC 2 compliance, both of which paired us with auditors who needed the details of our vulnerability management processes.

While vulnerability management is a foundational security practice that is absolutely essential, it is also generally tedious and repetitive. This makes it a prime candidate for automation and optimization. Building processes stringent enough to satisfy auditors without accruing massive overhead for our team and our engineering organization posed an interesting challenge.

On a more personal note, working through this challenge is where the DevSecOps benefits of Docker finally clicked for me.

In this post, I’ll walk you through the vulnerability management process we developed, built to satisfy auditors and align with our team’s use of CI/CD, Docker, and Kubernetes.

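Patching with a Docker tech stack

A bit of background: CircleCI’s stack is based on Docker. Plenty of articles have already discussed how this differs from traditional server-based architecture and how it affects DevOps and development. It also affects security and, specifically for this post, vulnerability management.

Docker images are immutable and contain their own infrastructure packages and libraries, which means you can’t “patch” one in the traditional sense of applying a patch to a server. Instead, a new image with updated dependencies is created and deployed, and the old image is retired.

The first consequence of this is that patching results in more deployments, which image owners and the SRE team need to be aware of and able to handle. We benefit from using our existing, stable pipelines for continuous deployment, which means the extra deployments have minimal impact on our operations.

Second, Docker can spread the maintenance cost of patching across more of the engineering organization. In a traditional server-based environment, sysadmins or SREs maintain the servers and deploy patches. At CircleCI, development teams are responsible for maintaining their Docker images, and therefore also for applying security patches to them. Some development teams may need to adjust to the security maintenance work that comes with the Docker images they use.

Third, many compliance professionals and auditors are not deeply familiar with how this technology stack works or what it means for vulnerability management. Set aside time to work with them so they understand how these tools affect the patching process.

Sketching out a plan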

Installing a vulnerability scanner doesn’t secure anything on its own. As with all security tools, in order to get any benefits, the tools need to be applied to our systems.

When we started this process, we knew our plan would look something like this:

  1. Get an idea of how bad things actually were and get some high priority patches in-flight.
  2. Start providing tighter feedback loops for the teams that own Docker images.
  3. Revisit the tooling and build automation around ticket generation and then reporting.
  4. Find a way to help teams optimize and tune their patching processes.

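The central principle informing the whole effort was to deliver feedback to engineering teams through the tools and interfaces they already use every day. That is more effective than distracting them with Slack messages, alerts, emails, meetings, and reports.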

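A common starting point: spreadsheets

The first thing we needed to do was get a handle on the scale and scope of the project, and be able to sort and filter the raw data to understand where we stood.

So we started in the same place many projects start: a spreadsheet.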

We ran .csv exports into spreadsheets, added formulas for summaries and some simple AppScript code to do a bit of data cleanup and reconciliation. We were looking to understand the different images and containers that were currently running in our live system, what vulnerabilities those had, and what patching cadence existed; just what was already there, before making any tickets or talking to any development teams. Which services had official owners, and which would we have to go find owners for? What was the shape of the data from the tooling? It was important for us to look at this data manually in a spreadsheet first, so that when we automated it, we’d understand the data coming out of our scanner.

This step provided a lot of details that gave us an initial view into some of the inconsistencies and complexities in the data. Another important set of information was the number of images and containers that didn’t have any official team-owners listed.
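The kind of first-pass summary we built with spreadsheet formulas can also be sketched in a few lines of Python. The column names (`image`, `severity`, `cve`, `owner`), the placeholder CVE IDs, and the sample rows below are assumptions for illustration, not the export format of any particular scanner.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical scanner export; real exports will have different columns.
SAMPLE_CSV = """image,severity,cve,owner
api-server,critical,CVE-0000-0001,team-api
api-server,high,CVE-0000-0002,team-api
worker,critical,CVE-0000-0003,
worker,low,CVE-0000-0004,
"""

def summarize(csv_text):
    """Count findings per image and severity, and collect images with no owner."""
    counts = defaultdict(Counter)
    unowned = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["image"]][row["severity"].strip().lower()] += 1
        if not row["owner"].strip():
            unowned.add(row["image"])
    return counts, sorted(unowned)

counts, unowned = summarize(SAMPLE_CSV)
for image, by_severity in sorted(counts.items()):
    print(image, dict(by_severity))
print("images without an owner:", unowned)
```

Even a summary this small answers the two questions that mattered most at this stage: which images carry critical findings, and which ones nobody officially owns.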

Then it was time to assign tickets, focusing on images with critical vulnerabilities first. We generated the first round of tickets by hand and assigned them to teams. Because some of the images didn’t have owners listed, we made a few educated guesses and then watched if those tickets were moved to other teams.

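Many of the teams that received these tickets weren’t used to getting tickets from other teams, much less from the security team. We got some pushback at first from development teams that hadn’t previously been tasked with Docker image maintenance and were worried about how long it might take. We worked with those teams to help get the initial patches deployed, which gave us the opportunity to discuss the importance of patching and how it could become a more routine thing.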

Another set of conversations that happened at this stage was with our auditors. After rolling out the first draft of our policy, we realized our patching timeframes were far too aggressive (turns out the auditors had thought this but didn’t push back). Development teams couldn’t keep up. We realized it was better to have a reasonable policy and hit it than be too aggressive and constantly make exceptions. We worked with our auditors to update the policy to a realistic level. This was an important lesson for us about the level of maturity our processes were at. We could tighten up time frames later, once we had repeatable processes.

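One thing we learned at this stage was to avoid perfectionism. Don’t let yourself get blocked by a particular team that is deep underwater, a ticket that is being kicked back and forth between a few teams, or a specific patch with technical problems. Make progress with the patches and teams you can.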

Divide and automate the workload with CI/CD

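We made what initial progress we could and moved on to the next phase. This was the biggest change that led to services getting patched regularly. It’s also where the DevSecOps benefits of Docker really came into focus for me.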

In traditional server-based infrastructure, the testing environment is separate from the production environment and those require independent patching and much more task-based overhead to discover whether any particular patches cause problems.

Because Docker images contain their software dependencies in a clearly defined unit of deployment, it becomes straightforward to scan those in CI/CD pipelines. With automated testing that includes the software dependencies, patches can be quickly tried and validated using existing tests.

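With scanning integrated into our CI/CD pipelines, vulnerability patching doesn’t become a big thing we do every month. It’s just baked into the way we ship.

For those of you planning to adopt this on your team, know that it will cause pain at first. Even if your team is good at CI/CD, you are adding a step to your build pipeline that will fail builds that would otherwise pass, and people will get frustrated. You have to work with them to get back to green. That’s okay, but be prepared for it. Any system that has gone unpatched for a while will take significantly more time to clean up, so make sure to coordinate with teams and management while you do the initial patching. Roll out to services in groups, and be prepared to grant temporary exceptions or disable enforcement for specific services for a short time while that initial patching is happening.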
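A CI gate with room for temporary exceptions could look roughly like the sketch below. The finding shape (`id`, `severity`, `fixed_in`), the severity ranking, and the choice to block only on findings that already have a fix are all assumptions made for illustration; they are not the format of any specific scanner or a description of our pipeline.

```python
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

def gate(service, findings, fail_at="high", exempt_services=frozenset()):
    """Return the findings that should fail the build for this service.

    findings: list of dicts with "id", "severity", and "fixed_in" keys
    (a hypothetical shape; adapt it to your scanner's report format).
    """
    if service in exempt_services:
        return []  # temporary exception while initial patching is under way
    threshold = SEVERITY_RANK[fail_at]
    return [
        f for f in findings
        # Only block on findings a team can act on: a fix must exist.
        if SEVERITY_RANK[f["severity"]] >= threshold and f.get("fixed_in")
    ]

sample = [
    {"id": "EXAMPLE-1", "severity": "critical", "fixed_in": "1.2.3"},
    {"id": "EXAMPLE-2", "severity": "high", "fixed_in": None},
    {"id": "EXAMPLE-3", "severity": "low", "fixed_in": "2.0.0"},
]
blocking = gate("api-server", sample)
for f in blocking:
    print(f"blocking: {f['id']} ({f['severity']}), fixed in {f['fixed_in']}")
exit_code = 1 if blocking else 0  # a CI step would pass this to sys.exit()
print("exit code:", exit_code)
```

Keeping the exemption list in configuration makes the temporary exceptions visible and easy to remove once a service is back to green.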

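The central principle at this stage is that fast feedback loops embedded in the development process help teams make patching part of their ongoing activity, rather than a large chunk of work that has to be tracked separately. This makes it much easier for teams to comply.

Automating production scanning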

As game-changing as CI/CD integration is, alone it is not sufficient for vulnerability management. It does a great job of catching vulnerabilities caused by new deploys, but one of the challenges with vulnerabilities is they’re discovered all the time in existing software. A vulnerability could surface today that affects the code that passed all tests and went into production yesterday. CI/CD is only one half of the puzzle; production environment scanning is the other half.

Put another way, for efficiency in interfacing with teams, integrating with CI/CD can’t be beat. But for seeing the full picture of your vulnerabilities, production scanning is required.

Back to the spreadsheets we go...

… but not for long. Spreadsheets are great for getting an overview of your data, but if you want to start generating tickets and reporting on how your vulnerability management is going, you are going to need more analysis than fits comfortably in a spreadsheet. So we built an integration pipeline that automated the biggest manual work for the security team. This meant taking data out of the production vulnerability scanner and putting it into tickets. As I emphasized before, it’s a really good idea for security teams to meet engineering teams in the tools they are already using on a daily basis, so getting our program to turn data into tickets was essential.

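This was more than a simple connector. The process of taking data from the vulnerability scanner and turning it into tickets involved intricate details that required a real programming language. I chose to implement this automation in Clojure because it is CircleCI’s primary development language, which matters a great deal when making technology decisions. Those decisions have to take into account whether other people at the company will be able to maintain the tool.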

The integration sounds super simple at first glance: Take data out of your prod vulnerability scanner, transform it, and make tickets. Famous last words.

There were a number of challenges to keep in mind with this integration:

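We needed to update existing tickets so that open tickets were not duplicated. We added an MGMT-ID in JIRA to track the image/ticket identity mapping.

Both the source and destination APIs changed over time, quite significantly. To handle this, we built abstraction layers around the APIs we integrated with in JIRA and the scanning tool.

The service had to normalize odd data from all the different distributions, libraries, and tools into something consistent. This ran from normalizing ‘moderate’ vs ‘medium’ (easy), to every vendor having a different format for “fixed in version” (harder).

To handle this, we built a data normalization layer that grouped sub-packages, de-duplicated CVEs (Common Vulnerabilities and Exposures), and parsed the many different string formats for fix versions. We also normalized package information so that the ticket details gave teams clear, specific requirements on what they should patch to.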
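To make the normalization layer more concrete, here is a minimal Python sketch (our tool was written in Clojure; this is not its code). The alias table contains only the ‘moderate’ vs ‘medium’ case mentioned above, and the version regex is a deliberate simplification: it finds the first dotted numeric version in a string and ignores things like epochs, release suffixes, and single-number versions.

```python
import re

# Only the alias named in the text; a real table would grow over time.
SEVERITY_ALIASES = {"moderate": "medium"}
KNOWN_SEVERITIES = {"low", "medium", "high", "critical"}

def normalize_severity(raw):
    """Map vendor-specific severity labels onto one consistent set."""
    value = raw.strip().lower()
    value = SEVERITY_ALIASES.get(value, value)
    if value not in KNOWN_SEVERITIES:
        raise ValueError(f"unrecognized severity: {raw!r}")
    return value

# First dotted numeric version in a string, e.g. "1.10.2" in ">= 1.10.2-r0".
VERSION_RE = re.compile(r"\d+(?:\.\d+)+")

def parse_fixed_version(raw):
    """Pull a dotted version out of a free-form 'fixed in' string, or None."""
    match = VERSION_RE.search(raw or "")
    return match.group(0) if match else None

def dedupe_by_cve(findings):
    """Keep one finding per (image, cve) pair, preserving first-seen order."""
    seen = set()
    unique = []
    for finding in findings:
        key = (finding["image"], finding["cve"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique

print(normalize_severity("Moderate"))
print(parse_fixed_version("Fixed in version 2.4.1"))
print(parse_fixed_version(">= 1.10.2-r0"))
```

Raising on an unknown severity, rather than guessing, surfaces new vendor formats as soon as they appear instead of letting them slip into tickets.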

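We had to work out how to assign tickets to teams as engineering projects and teams changed in JIRA. We created a flexible team-assignment configuration that supports regular expressions and partial matching. Another major learning in team assignment was to throw an error if the configuration matches multiple teams or zero teams. This notifies the security team whenever additional configuration is needed.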
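A team-assignment rule set of this kind can be sketched as follows. The team names and patterns are hypothetical, and the error type is introduced only for this example; the point is that anything other than exactly one match is treated as a configuration problem rather than silently resolved.

```python
import re

# Hypothetical configuration: each team lists regex patterns for image names.
TEAM_PATTERNS = {
    "team-api": [r"^api-", r"gateway"],
    "team-data": [r"^etl-", r"warehouse"],
}

class AssignmentError(Exception):
    """Raised when an image matches zero teams or more than one team."""

def assign_team(image, team_patterns=TEAM_PATTERNS):
    """Return the single team whose patterns match the image name."""
    matches = sorted(
        team
        for team, patterns in team_patterns.items()
        # re.search gives partial matching; anchor with ^ or $ for stricter rules.
        if any(re.search(pattern, image) for pattern in patterns)
    )
    if len(matches) != 1:
        raise AssignmentError(
            f"{image!r} matched {len(matches)} teams {matches}; update the config"
        )
    return matches[0]

print(assign_team("api-server"))
for image in ("etl-gateway", "mystery-service"):
    try:
        assign_team(image)
    except AssignmentError as err:
        print("needs attention:", err)
```

Here `etl-gateway` matches two teams and `mystery-service` matches none, so both are reported to the security team instead of landing in the wrong backlog.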

This took quite a bit of work, but resulted in a simple command line application that could read the latest information, generate detailed tickets, and assign them to teams in a matter of minutes.
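The de-duplication requirement described earlier amounts to an upsert keyed on a stable identifier, with the ticketing system hidden behind a small interface so API changes stay contained. The sketch below uses an in-memory stand-in rather than a real JIRA client; the class, method, and field names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Ticket:
    tracking_id: str  # stable identity for an image, stored on the ticket
    summary: str
    status: str = "open"

@dataclass
class InMemoryTracker:
    """Stand-in for a ticketing API; a real client would offer the same methods."""
    tickets: dict = field(default_factory=dict)

    def find_open(self, tracking_id):
        ticket = self.tickets.get(tracking_id)
        return ticket if ticket and ticket.status == "open" else None

    def create(self, tracking_id, summary):
        self.tickets[tracking_id] = Ticket(tracking_id, summary)
        return self.tickets[tracking_id]

    def update(self, ticket, summary):
        ticket.summary = summary
        return ticket

def sync_ticket(tracker, image, finding_count):
    """Create a ticket for the image, or update the open one if it exists."""
    tracking_id = f"image:{image}"
    summary = f"{image}: {finding_count} open vulnerabilities"
    existing = tracker.find_open(tracking_id)
    if existing:
        return tracker.update(existing, summary), "updated"
    return tracker.create(tracking_id, summary), "created"

tracker = InMemoryTracker()
print(sync_ticket(tracker, "api-server", 3)[1])
print(sync_ticket(tracker, "api-server", 5)[1])
print(len(tracker.tickets), "ticket(s) in total")
```

Running the sync twice for the same image leaves one ticket with refreshed details, which is what makes the command safe to run on a schedule.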

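The key principle at this stage is to invest in automation that eliminates repetitive and error-prone work for the security team. Things like the flexible team configuration save time every week, because small changes to team assignments or newly discovered unknown containers happen quite often.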

Reporting on production

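A key thing to remember about security work is that doing the right thing isn’t enough; you have to be able to show that you are doing the right thing. Our goal for reporting was to create a single report that covers the needs of everyone (executives, team managers, auditors). Not only does this save work, it also helps ensure that everyone has the same understanding of the state of things.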

Since vulnerability management for Docker differs from traditional patching, we had to work with our auditors so they could understand how we were assigning work and make use of the reports we were generating. Traditionally, sysadmin teams would be the ones doing patching, and that’s what our auditors were used to seeing. It took some work to explain that, in our case, every engineering team across the org would be doing patching. After some back and forth, we added some additional details to our report and were able to design one report format that met the needs of our FedRAMP auditors as well as our internal stakeholders.

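The key principle at this stage is to investigate and work out a single report that satisfies all interested parties, and then generate it automatically.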
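As a rough illustration of what generating such a report automatically can involve, the sketch below rolls open findings up per team and counts how many are past their patching window. The field names and the window lengths are placeholders chosen for the example; they are not our policy values or our report format.

```python
from collections import Counter
from datetime import date

# Placeholder policy: days allowed to patch, by severity (not real policy values).
PATCH_WINDOW_DAYS = {"critical": 15, "high": 30, "medium": 90, "low": 180}

def team_report(findings, today):
    """Summarize open findings per team: counts by severity and overdue total."""
    report = {}
    for finding in findings:
        entry = report.setdefault(
            finding["team"], {"by_severity": Counter(), "overdue": 0}
        )
        entry["by_severity"][finding["severity"]] += 1
        age_days = (today - finding["first_seen"]).days
        if age_days > PATCH_WINDOW_DAYS[finding["severity"]]:
            entry["overdue"] += 1
    return report

findings = [
    {"team": "team-api", "severity": "critical", "first_seen": date(2024, 1, 1)},
    {"team": "team-api", "severity": "high", "first_seen": date(2024, 2, 25)},
    {"team": "team-data", "severity": "low", "first_seen": date(2024, 2, 1)},
]
for team, entry in sorted(team_report(findings, date(2024, 3, 1)).items()):
    print(team, dict(entry["by_severity"]), "overdue:", entry["overdue"])
```

The same per-team numbers can feed an executive summary, a manager's view, and an auditor's evidence, which is the reason to settle on one format first.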

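Optimizing patching

Work with individual teams to optimize their patching. Make sure this is collaborative, not dictated. The goal is to help them fit patching into their existing workflows rather than create new ones. Remember that engineering teams are more familiar with their own processes and will have good ideas about how to optimize in ways the security team may not have thought of.

The three most productive techniques here were automated integration testing, shared base images, and soft-version specification.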

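  1. Automated integration tests should generally be part of the CI pipeline, allowing patches to be applied quickly and with more confidence. Because every patch automatically runs through the whole integration test suite, which detects problems the patch causes, patching takes far less human effort.
  2. Shared base Docker images allow services with similar infrastructure and software requirements to be based on a common image. Common updates are made once in that image and then rolled out to all of the specific services. This lets most dev teams simply pull the latest shared image as the first step in any patching.
  3. Soft-version specification is done by simply specifying a major version or major + minor version of packages and letting the package manager automatically update to the latest versions as part of a scheduled Docker image build. If that scheduled build runs weekly, it removes much of the human intervention from ongoing patching. There are some operational risks associated with this kind of automated version bumping, so if this level of automation is started, it is important to have thorough integration tests and to try it out on less critical services first.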
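The selection logic behind soft-version specification can be shown with a short sketch. In practice the package manager's own range syntax does this work; the functions and version list below are hypothetical and handle only plain dotted numeric versions.

```python
def parse_version(text):
    """Turn '2.4.10' into (2, 4, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def latest_matching(spec, available):
    """Return the newest available version matching a soft spec, or None.

    spec is a version prefix such as '2' (major) or '2.4' (major + minor).
    """
    prefix = parse_version(spec)
    candidates = [
        version for version in available
        if parse_version(version)[:len(prefix)] == prefix
    ]
    return max(candidates, key=parse_version, default=None)

available = ["2.3.9", "2.4.2", "2.4.10", "3.0.0"]
print(latest_matching("2.4", available))  # 2.4.10
print(latest_matching("2", available))    # 2.4.10
print(latest_matching("4", available))    # None
```

Note that `2.4.10` is chosen over `2.4.2` because versions are compared as numbers, not strings. A major-only spec accepts more change per rebuild than a major + minor spec, which is where the integration tests earn their keep.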

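The core principle at this stage is to make it easier to apply patches. In-depth analysis and prioritization of vulnerabilities is expensive and time-consuming for the security team. There is also the danger of incorrectly marking a high-severity vulnerability as not applicable to the environment. If the patching flow is simple and straightforward, teams will simply apply the patches and keep everything moving.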

Tips for vulnerability management

(Or, how to work with auditors, Docker, and CI/CD systems)

  1. Engage with engineering teams using the tools and processes they already have in place.
  2. Don’t be too dogmatic at first.
  3. Build patching into your existing CI/CD.
  4. Invest in automation of vulnerability management.
  5. Create a single report that contains the information all stakeholders need.
  6. Make patching simple and straightforward.

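Closing thoughts

When rolling out vulnerability management, it’s easy to get bogged down in early problems and slip into firefighting mode. Remember that this isn’t about one round of patching, a single critical vulnerability, or one team or image. It’s about building consistent, repeatable automation that continuously improves security while minimizing the impact on the workflows of engineering and security teams.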
