Wednesday, February 27, 2013

Federal IA certification of systems

by Eric Whyne
Also scheduled to be posted on Data Tactics Blog.

From the perspective of a software developer, federal software information assurance evaluation always seems to be an unnecessarily complicated topic. At first glance there seems to be a rigidity and lack of pragmatism or common sense to the process. I don't think that's really the case, but I refuse to add anything to the dialog that might confuse folks even more than they already are. In this post I identify the important parts and break things down into the elements that are common across the various frameworks and describe my approach to each. For the last few years there has been talk of mandating that more systems undergo IA evaluation, specifically those identified as "Critical Infrastructure". Maybe in the future it won't just be systems for the federal government having to undergo IA risk management.

I don't take the topic lightly; my career started in Information Assurance and for the past 8 years I've been involved with organizing a fairly large project that creates the Computer Security Handbook (currently in its 5th edition). Like that book, this post approaches security certification/assessment from the perspective of a manager (aka project level decision maker) trying to get a project through the process. I wanted to plainly describe the important steps and put forth a pragmatic approach for how to achieve success and avoid common mistakes I've seen made.

When getting systems certified to deploy on Government networks we are essentially just doing risk management and mitigation. A common jest I hear from some engineers is that certification is just a bunch of paperwork. To be honest, that's mostly true. Risk management usually means paperwork. When I was a Marine Corps officer and we were doing Operational Risk Management (ORM) for live fire training exercises it was... paperwork. But it was useful paperwork that saved lives by making us think through the details of what could go wrong and how to prevent it. When approached correctly, security assessments can and do help us deploy safer systems. It's important to keep a positive attitude through the process and stay focused on those goals.

The risk management frameworks we work with in the federal space fall within the Department of Defense Directive 8500.x series or the NIST Risk Management Framework (RMF). NIST RMF is how DCID 6/3 and ICD 503 are implemented, guided in the DoD by CNSSI 1253, which is complementary to SP 800-53 and specifically addresses national security systems. Although ICD 503 rescinded the mandate to use DCID 6/3, DCID 6/3 continued to be used because no published guidance was available to replace it. ICD 503 is very broad and designed to be static; it outlines basic goals but not how to achieve them. If this sounds confusing, it's because it is.
 
My advice is that if your organization is trying to figure out how to do this from scratch, focus on the NIST RMF approach (CNSSI 1253, SP 800-53, whatever revision they are on). Most organizations claim to be transitioning to that. I'll use some DoDD 8500 examples in this post because the fundamentals really stay the same, which is what you need to be focused on. There are minor differences in vocabulary. For example, DoD 8500 discusses Certification and Accreditation while NIST RMF uses the terms Assessment and Authorization. Don't get lost in the details of the semantics. Understand that much of what is written can be subject to very broad interpretation where the final decisions rest at the lowest levels (which is where I think they should be). The fundamentals are where you gain traction and are how you make security happen. Your job as the project manager and engineer is to make security happen and document how it was done to demonstrate the level of security to others.

Some advice: read each of the documents if you can find the time. People will always think you are brilliant even if you just spend an hour scanning each important manual. Smart people usually aren't really smarter than anybody, they just work harder and focus more. You'll gain an authority that will help you control scope later on. It will also give you an awareness of the mindsets driving these processes. I haven't counted, but with some of the documents weighing in at over 200 pages, the federal software security certification/assessment guidance probably rivals the complexity of US tax law. Because of this complexity and unavoidable contradictions, in any given meeting you'll find that people's understanding of the processes varies widely. When uncertainty creeps into conversations, folks tend to defer decisions to outside authorities. I heard a great term for this in a meeting a few weeks ago: "the gonculator". Unfortunately, there is no magical outside authority that will make the detailed security decisions for you. Take ownership. Understand that cowardice is just as fatal to your project as imprudent courage. With that said, caution and humility are always advised, regardless of how well you think you understand things. I've seen engineering managers become frustrated because they have been unable to plan for the various activities required by accreditation. This can cause tension between the IA team and the engineers. Those situations are created by conflicting goals and vocabularies. The two domains are often unable to speak to the goals of the other party in their conceptual frameworks and folks get rightfully frustrated. Here are the basics to help you be more prepared for those dialogues and have some more understanding and courage to ensure the right actions are taken and the best decisions are reached.

In my opinion, as an engineer or an engineering project manager you only really need to have a detailed understanding of the contents of one of two guiding documents, depending on whether you have to do 8500 or NIST RMF. These documents lay out the generic security controls that your system will need to meet. There will be various decisions along the way that determine exactly which generic controls need to be met, but if you don't want to be caught with a nasty surprise you should have a good idea of what the controls are and have a good feeling for the intent of each of them.

For NIST RMF, the generic controls are documented in SP 800-53 "Recommended Security Controls for Federal Information Systems and Organizations".

For 8500, the generic controls are documented in DoD Instruction 8500.2 "Information Assurance (IA) Implementation" (often loosely called DoDD 8500.2).

Don't memorize them because, unfortunately, you won't be contending directly with these controls when deploying systems. What you'll be contending with is your Information Assurance staff's interpretation of them, and those interpretations will vary. For an engineering staff going through the processes of any of these frameworks, the fundamental approaches are going to be the same. There is an evaluation and mitigation of risk which results in a decision to let the system deploy (or not). You will need to undertake five unique, separate, and fundamental engineering activities. These are my "activities" derived from my experience. Think of them as a conceptual framework to put the actual activities of whatever process you're dealing with in context. Different processes implement these fundamental approaches in different ways.

1. Implement generic "good idea" guidance. This is where all the process paperwork happens, but some of the guidance is very technical. In 8500 these generic good ideas are called "IA Controls". In NIST RMF these are called simply "Security Controls". They are documented in the two documents that I think you should become familiar with above (SP-800-53 and DoDD 8500.2). This is how you integrate security during the planning stages of development. Having a deliberate planning activity that goes through each technical security control in order of severity and applies it to your design during development will make your system fantastically more secure. At that point, writing ideas about how to verify those controls in the end system pretty much knocks out your test plan. Understand that not doing this stuff early means you'll be incurring technical debt that will have to be paid before you can deploy your system and it only gets more difficult as time goes on.

2. Find out which documented vulnerabilities affect your system, prioritize and fix them. The universal system for referencing vulnerabilities is Common Vulnerabilities and Exposures (CVE) identifiers.

3. Find out which specific secure configuration guidance applies to the specific software in your system and then implement it. In 8500 these are called Security Technical Implementation Guides (STIGs). NIST RMF has taken the software specific implementation guidelines and created an XML dissemination format which automates them; it's called Security Content Automation Protocol (SCAP).

4. Conduct automated vulnerability scanning and evaluate the results. Document the occurrence of false positives (say why they are false) and prioritize and fix any vulnerabilities found.

5. Whatever you can't fix right now needs a plan that describes how it will be mitigated or fixed in the future. In 8500 this is called a Plan of Action and Milestones (POA&M).

In most risk management frameworks (thinking Project Management Institute (PMI) approach here), the first step is to identify risks and write them down in a spreadsheet called a risk register. Then you assign two subjective numbers: one is how probable you think the risk is, and the other is how bad the effects would be if it happened. Then you prioritize, conduct risk mitigation (i.e. take action to lower either the probability or the effects) on the worst of them, and periodically start that process over (aka monitor risks). This is very productive and it's plain to see how it can help. It certainly helped me avoid Marines getting injured or killed in training accidents, so I'm a believer.
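As a rough illustration of that register, here is a minimal Python sketch. The 1-to-5 scales, the probability-times-impact score, the example risks, and the names Risk and prioritize are all assumptions made for this example; none of them come from PMI or from any federal guidance.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    probability: int  # subjective, 1 (unlikely) to 5 (almost certain)
    impact: int       # subjective, 1 (minor) to 5 (catastrophic)

    @property
    def score(self) -> int:
        # One common convention: exposure = probability x impact
        return self.probability * self.impact


def prioritize(register):
    """Return the risks ordered from highest score to lowest."""
    return sorted(register, key=lambda r: r.score, reverse=True)


# Hypothetical entries, made up for this example
register = [
    Risk("Backup restore never tested", probability=2, impact=5),
    Risk("Unpatched web server library", probability=4, impact=4),
    Risk("Expired TLS certificate", probability=3, impact=2),
]

for risk in prioritize(register):
    print(f"{risk.score:>2}  {risk.description}")
```

Re-running the same loop after each round of mitigation, with updated numbers, is the "monitor risks" step.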

Unfortunately computers and software systems are very complex. The risks are complex and, to be completely honest, nobody has a real clue about how probable the risks are. To combat this complexity the 8500.x accreditation and NIST RMF start off by requiring a determination of how much availability, integrity, and risk acceptance the system needs or can have. This assessment is mostly used to decide which generic guidance needs to apply to your system.

Once your system has this classifier associated with it, the process begins for real. (Important note: only systems in the context of their deployed environment are accredited, not individual software packages.) So let's say you're doing DoD 8500 and your system is determined to be MAC II Classified. That's a common determination. That designation will determine what Information Assurance (IA) Controls you are required to implement on the system. The IA controls are generic guidance about how to configure the system. You'll find the details about IA Controls in DoD 8500.2 in the Enclosures. Specifically, Enclosure 2 talks about Mission Assurance Categories and Confidentiality levels, then Enclosure 4 describes the IA Controls for each resultant category of system. As soon as possible you need to have the IA staff provide you a list of IA controls you'll have to implement on your system. Have them provide it to you and get them to agree that this is the correct list. Hold a meeting, preferably in person, to review each one. In an 8500 accreditation there are usually around 70 specific controls you need to meet. You show that you have met them by taking the required action and writing down what was done or how the system meets the requirement.

Here are IA Control examples and a short discussion of potential pitfalls:

IA Control DCAR-1

"An annual IA review is conducted that comprehensively evaluates existing policies and processes to ensure procedural consistency and to ensure that they fully support the goal of uninterrupted operations."

The four letters DCAR have a special meaning, but we don't need to memorize it, so we can just treat it as a categorical identifier (see my post on numbers in data). What this IA Control is saying is that you need to have an annual review. Sounds easy. Where this can go south is when the IA staff wants to kick off right into the first annual review and make generation of each of the procedural documents a requirement for completing this control. Controlling scope is an important part of getting through the process successfully. If possible, you should push those into the "plan for the future" phase so you can have time to do them correctly. If you end up not being able to control scope here, the worst case scenario is that you burn a few weeks of your schedule in process documentation. But if you have to do that, do it smartly. Use the time to pay off that technical debt your project has probably been accruing. Cover your bases on the other process focused IA controls. Divide and conquer: make sure your engineering team is working the controls specific to technical implementation while you wrestle with the process ones.

As for other Process focused IA Controls, they might require you to write things like a System Security Plan (SSP) or other security documentation artifacts. Just go with what your IA staff needs or wants because this varies widely from organization to organization. Some organizations are easy and pragmatic about this, some are unnecessarily hard (in my opinion). It has to do with the overall technical competence of the organization and how risk tolerant they are. While you're doing this, plan ahead. When you add new software to the system in the future it will be facilitated by the processes you lay out here. There are tricks to writing good processes that make that easy, but that's left for another blog post.

DCMC-1 Mobile Code
I'm not going to paste in the text for this one here, but it basically says that "mobile code" needs to be signed with a certificate. Yes, JavaScript is mobile code. Before you start panicking about having to cryptographically sign every piece of dynamically generated web app JavaScript in your system, take a deep breath and relax. JavaScript when executed in a browser is exempt from this requirement. Here's the guidance from NIST that you can cite: "Within a browser context, JavaScript does not have methods for directly accessing a client file system or for directly opening connections to other computers besides the host that provided the content source. Moreover, the browser normally confines a script’s execution to the page in which it was downloaded." There are DoD documents that say the same thing, and your IA staff should know this (but some don't). Still, watch out for things like cross-site scripting vulnerabilities when deploying JavaScript/HTML5 interfaces. If you don't know what those are, you need to study them. Basically, if at any time a user can generate JavaScript that will be shown on another user's screen, you're vulnerable.
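To make the cross-site scripting point concrete, here is a small Python sketch of one common mitigation: escaping user-supplied text before it is written into a page. The function name render_comment and the sample payload are made up for this example. It only covers text placed in an HTML body; a real application would normally lean on a templating engine with auto-escaping rather than hand-built strings.

```python
import html


def render_comment(author: str, text: str) -> str:
    """Build an HTML fragment, escaping everything that came from a user."""
    safe_author = html.escape(author, quote=True)
    safe_text = html.escape(text, quote=True)
    return f"<p><b>{safe_author}</b>: {safe_text}</p>"


# A hostile "comment" that would run in another user's browser if echoed as-is
payload = "<script>alert('xss')</script>"
print(render_comment("mallory", payload))
```

After escaping, the browser displays the script tag as literal text instead of executing it.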

DCPA-1 Partitioning the Application
"User interface services (e.g., web services) are physically or logically separated from data storage and management services (e.g., database management systems). Separation may be accomplished through the use of different computers, different CPUs, different instances of the operating system, different network addresses, combinations of these methods, or other methods as appropriate."

DCPA-1 is probably my favorite IA Control. I like straightforward and useful technical guidance. Watch out though, because controls like this can cause some major refactoring of your system if it wasn't built properly. If you tried reducing the number of operating system licenses for your system by cramming stuff together, you might be running into problems here. Next thing you know you're not only changing your budget (buying extra licenses) but your schedule is off as you rebuild the systems to implement the required partitioning of the application. On a related database note: please use prepared SQL statements if you're using a database in your application. If you let users control an unfiltered text variable that gets placed into an SQL statement you might as well be giving them the keys to the database. This is database security 101, but it still happens. Finding mistakes like this is one of the real values of code reviews.
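For readers who haven't seen a prepared (parameterized) statement, here is a minimal sketch using Python's built-in sqlite3 module. The users table, its contents, and the find_user helper are hypothetical and exist only for this example; other database drivers use the same idea with their own placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "analyst")],
)


def find_user(conn, name):
    # The ? placeholder binds the value separately from the SQL text,
    # so user input is treated as data and never parsed as SQL.
    cur = conn.execute("SELECT name, role FROM users WHERE name = ?", (name,))
    return cur.fetchall()


print(find_user(conn, "alice"))
# A classic injection string simply matches no rows
print(find_user(conn, "x' OR '1'='1"))
```

Had the query been built by pasting the name into the SQL string, the second call would have returned every row in the table.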

Once you get a good handle on the IA Controls/Security Controls you've been handed, breathe easy. The rest of the process is focused on the specific technologies of your system and is more straightforward. At some point you'll be asked to provide a list of the software on the system and each of the version numbers. What the IA staff wants to do is check two databases for the software. One is the National Vulnerability Database (NVD), the other is the DISA Security Technical Implementation Guides (STIGs) site or the Security Content Automation Protocol (SCAP) site. Here is the site for the National Checklist Program.

When they look your software up in the NVD you should get back a list of Common Vulnerabilities and Exposures (CVE) identifiers. Just as with the IA Controls, make sure they provide you with this list and that everyone agrees on it. CVEs are just publicly known vulnerabilities in the software. The reason the IA staff will get this list is that they will prioritize each vulnerability by assigning it a Category. These range from Cat I which is "results in total loss or provides immediate access" to Cat IV which is something like "results in degraded security". You need to come to an agreement about which vulnerabilities need to be addressed before the system goes live. The determination should take into account attack surface and severity. Make sure you know what needs to be addressed. Once everyone is in agreement, have the engineering team start closing the holes and writing down what they do. Since these are all technical undertakings, my approach has been to paste all the CVEs that need to be answered in a wiki. Again, divide and conquer. Set folks to work in accordance with their capabilities and have everyone paste their results and remediation notes into the wiki next to their CVEs. Upon completion, export to a document and you're done. Keep the wiki up and keep building on it as new vulnerabilities come out.

A note on patching vulnerabilities. It's a necessary practice to check if the vulnerability has already been fixed on your system before attempting to patch it! It usually has been. This is especially true on Linux systems. CVEs identify software by version number, but on modern software systems lots of folks have good reason to use older versions of software. For example, older versions tend to be more stable. This means that security patches are almost always back-ported to older versions of software. In order to determine if the software you are using has been patched, look at the release notes. You'll typically find the CVE numbers mentioned in there. On Red Hat and CentOS you can do this by using the command "rpm -q --changelog <package name>". I never feel bad paying for Linux support licenses because it allows companies like Canonical and Red Hat to stay up on this stuff. They are the ones doing all the hard work back-porting security patches. Here is Ubuntu's page on the matter.
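That changelog check is easy to script once the list of assigned CVEs is agreed on. The Python sketch below is one way to do it, under a few assumptions: the helper names are mine, the sample changelog text and CVE numbers are invented, and package_changelog only works on an RPM-based system. A CVE that isn't mentioned in the changelog is not proven unpatched; it just needs a closer manual look.

```python
import re
import subprocess

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def cves_in_changelog(changelog_text):
    """Return the set of CVE identifiers mentioned in changelog text."""
    return set(CVE_PATTERN.findall(changelog_text))


def package_changelog(package):
    """Run the rpm command quoted above and return its output."""
    result = subprocess.run(
        ["rpm", "-q", "--changelog", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def unaddressed(assigned_cves, changelog_text):
    """CVEs from the IA list that the changelog does not mention."""
    return sorted(set(assigned_cves) - cves_in_changelog(changelog_text))


# Made-up changelog excerpt and CVE numbers, for illustration only
sample = """\
* Tue Jan 08 2013 Packager <packager@example.com> - 1.0-2
- fix buffer overflow in parser (CVE-2012-0001)
- backport fix for CVE-2012-0002
"""
print(unaddressed(["CVE-2012-0001", "CVE-2012-0003"], sample))
```

On a real host you would pass package_changelog("<package name>") in place of the sample text and paste the results into the wiki next to each CVE.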

STIGs are addressed much like CVEs. Have the IA staff provide you the list and assign categories to each item, then work together to determine prioritization. As the name implies, Security Technical Implementation Guides deal with how the software is configured. When your engineering team starts working on the STIGs, have them document what was done to implement the requirement or reduce the attack surface. If you're doing NIST RMF, you'll get the SCAP XML file and tools which should automate things a bit. Stuff will break. That's to be expected, just make sure you interact with the engineers enough to understand what's going on and how the schedule will change. Stuff like this has a potential to create project drag and shift schedules, so keep a handle on it and keep stakeholders updated.

Aside from some general IA Controls/Security Controls, up to this point we've mostly just addressed how to secure software that is popular enough to have made it into the STIG guidance, SCAP, or NVD. Astute technologists will note that there's a good chunk of custom developed code or obscure software that needs to be addressed more closely. You're right. The last technical hurdle is system scans. Typically this will be done by deploying the software to a test environment or conducting non-destructive scanning on the soon-to-be-production system. It always seems like you get an extremely long list of false positive results from any scanning software. I'm not a fan of having to deal with those. If you've ever attended an IA meeting, this is where the huge numbers of "vulnerabilities" might be thrown about. (See my blog post about Automated Vulnerability Scanning to read my advice if you're stuck managing the conduct of scans.) Don't let the IA team throw that list "over the wall" at you and ask you to put a schedule on its completion. If they don't acknowledge that at least some of the results are false positives from the start, you're going to have a hard time getting anywhere. Demand a meeting to review the scan results with the IA folks that did the scan and invite key members of the engineering staff. Categorize the results and get some idea of how to prioritize addressing them. Often, whole chunks of results can be easily identified as being false positives in this meeting and can be dismissed immediately. Add a column to the spreadsheet listing the findings and flag each of those as a false positive, with a note saying why. Then evaluate the rest that you're unsure of to the satisfaction of the IA staff. Don't waste time. Keep going for the kill and keep closing out items. Above all, don't throw the list of thousands of items "over the wall" to your engineering staff and ask them to come up with a schedule. Take ownership initially and once the list of uncertainties is manageable then start delegating detailed exploration and fixes. Document via the wiki strategy we described previously.
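If the scan results arrive as a spreadsheet export, the false positive column can be added with a few lines of Python. This is only a sketch: the CSV layout, the column names, the findings, and the triage and open_items helpers are all hypothetical, and real scanners export their own formats.

```python
import csv
import io

# Hypothetical export format; real scanners use their own column names
SCAN_CSV = """\
finding_id,host,title
1001,web01,Apache version banner indicates vulnerable release
1002,db01,Default database account enabled
1003,web01,SSL certificate cannot be trusted
"""

# Decisions recorded in the review meeting: finding_id -> justification
FALSE_POSITIVES = {
    "1001": "Fix back-ported by vendor; banner shows upstream version only",
}


def triage(scan_csv, false_positives):
    """Add 'false_positive' and 'justification' columns to each finding."""
    rows = []
    for row in csv.DictReader(io.StringIO(scan_csv)):
        reason = false_positives.get(row["finding_id"], "")
        row["false_positive"] = "yes" if reason else "no"
        row["justification"] = reason
        rows.append(row)
    return rows


def open_items(rows):
    """Findings that still need evaluation or a fix."""
    return [r for r in rows if r["false_positive"] == "no"]


rows = triage(SCAN_CSV, FALSE_POSITIVES)
for r in open_items(rows):
    print(r["finding_id"], r["host"], r["title"])
```

Keeping the justification next to each dismissed finding means the reasoning survives into the final package instead of living only in meeting notes.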

As we've gone through the various hurdles, there was probably a list of items that can't be completed prior to the release of the software. They might be in a lower risk category or be just too difficult to fix in a reasonable amount of time. It doesn't mean you need to push back your release schedule. What happens with those items is a planning exercise. I saved mentioning this for last because it should be done late in the process after other options have been explored and the engineering staff has a good grasp of the implications of addressing the open items. Create a Plan of Action and Milestones (POA&M) document which lists each of the open issues, summarizes the risks, and identifies the actions that will be taken to close the issues.
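A POA&M is usually a document or spreadsheet in whatever template your organization requires, but its core fields are simple enough to sketch in Python. The field names, the sample entries, and the overdue helper below are assumptions for illustration and do not follow any official template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PoamItem:
    issue: str          # the open finding, e.g. a CVE or STIG item
    risk_summary: str   # why it matters and why it can wait
    action: str         # what will be done to close it
    milestone: date     # target completion date
    closed: bool = False


def overdue(items, today):
    """Open items whose milestone date has already passed."""
    return [i for i in items if not i.closed and i.milestone < today]


# Placeholder entries invented for this example
poam = [
    PoamItem("Annual IA review procedures not yet written",
             "Process gap; no direct technical exposure",
             "Draft and approve review procedure",
             date(2013, 6, 30)),
    PoamItem("Legacy service uses weak cipher",
             "Reachable only inside the enclave",
             "Upgrade service in next release",
             date(2013, 3, 31)),
]

for item in overdue(poam, today=date(2013, 5, 1)):
    print(item.milestone.isoformat(), item.issue)
```

Checking for overdue items on a schedule keeps the plan from turning into a list of things nobody intends to fix.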

If you've gotten this far, the last step is to compile all of these artifacts (which should effectively be a list of the actions everybody took to secure the system!) and put them all together in a single package for review. Done effectively, this compilation of results provides a realistic estimate of the security of the system and can be used to grant it an authority to operate on the "enclave" that we are trying to introduce it to. Done poorly, the package is a soupy mash of obscure and useless information that doesn't drive a decision. If at all possible, dive into helping with the creation of this package. At the very least, demand to see it prior to it being shown to the decision maker. Often the person making the decision is so senior that they haven't been involved in the assessment and security process up until this point. Make sure that your system makes a good first impression. If the package reflects poorly on the system, immediately address the issues and push back the decision. Don't let it go forward and just hope for a good result. Authorizations to operate are ephemeral and come in various flavors which designate just how temporary they are. An Interim Authority To Test (IATT) can be valid for just a few weeks or months. A full blown Authority To Operate (ATO) can be valid for much longer before the system must be reassessed.

My hope is that this guidance has assisted somewhat in providing some clarity on the process. Guidance periodically changes, but the five fundamental activities I've laid out here are common sense and common across the various frameworks of the past. Security means way more than just checking vulnerabilities and patching them. It requires addressing the system as a whole and being diligent about all five aspects of this process. Keep a working awareness of the generic "good idea" documents I've described above and make the knowledge in them part of your engineering culture. You'll write more secure software and alleviate some of the engineering anxiety when approaching certification time.