Friday, October 26, 2007

Developing IT System Security Plans

Introduction

The objective of system security planning is to improve protection of information technology (IT) resources. The purpose of the security plan is to provide an overview of the security requirements of the system and describe the controls in place or planned for meeting those requirements. A typical computer system security plan briefly describes the important security considerations for the system and provides references to more detailed documents, such as system security plans, contingency plans, training programs, accreditation statements, incident handling plans, or audit results. This enables the plan to be used as a management tool without requiring repetition of existing documents. For smaller systems, the plan may include all security documentation. As with other security documents, if a plan addresses specific vulnerabilities or other information that could compromise the system, it should be kept private. It also has to be kept up-to-date.

The recommended approach is to draw up the plan at the beginning of the computer system life cycle. Security, like other aspects of a computer system, is best managed if planned for throughout the computer system life cycle. It has long been a tenet of the computer community that it costs ten times more to add a feature in a system after it has been designed than to include the feature in the system at the initial design phase. The principal reason for implementing security during a system's development is that it is more difficult to implement it later (as is usually reflected in the higher costs of doing so). It also tends to disrupt ongoing operations.

The system security plan also delineates responsibilities and expected behavior of all individuals who access the system. The security plan should be viewed as documentation of the structured process of planning adequate, cost-effective security protection for a system. It should reflect input from various managers with responsibilities concerning the system, including information owners, the system operator, and the system security manager. Additional information may be included in the basic plan and the structure and format organized according to agency needs, so long as the major sections described in this document are adequately covered and readily identifiable. In order for the plans to adequately reflect the protection of the resources, a management official must authorize a system to process information or operate. The authorization of a system to process information, granted by a management official, provides an important quality control. By authorizing processing in a system, the manager accepts its associated risk.

Management authorization should be based on an assessment of management, operational, and technical controls. Since the security plan establishes and documents the security controls, it should form the basis for the authorization, supplemented by more specific studies as needed. Periodic reviews of controls should also contribute to future authorizations. Re-authorization should occur prior to a significant change in processing, but at least every three years. It should be done more often where there is a high risk and potential magnitude of harm.

System Analysis

Once completed, a security plan will contain technical information about the system, its security requirements, and the controls implemented to provide protection against its risks and vulnerabilities. You will need to perform an analysis of the system to determine the boundaries of the system and the type of system.

System Boundaries

Defining what constitutes a "system" for the purposes of this guide requires an analysis of system boundaries and organizational responsibilities. A system, as defined here, is identified by constructing logical boundaries around a set of processes, communications, storage, and related resources. The elements within these boundaries constitute a single system requiring a security plan. Each element of the system must:

  • Be under the same direct management control;

  • Have the same function or mission objective;

  • Have essentially the same operating characteristics and security needs; and

  • Reside in the same general operating environment.

All components of a system need not be physically connected (e.g., [1] a group of stand-alone personal computers (PCs) in an office; [2] a group of PCs placed in employees' homes under defined telecommuting program rules; [3] a group of portable PCs provided to employees who require mobile computing capability for their jobs; and [4] a system with multiple identical configurations that are installed in locations with the same environmental and physical safeguards).
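
The four boundary criteria above can be read as a simple grouping test. The following is a minimal sketch, not part of any formal method: the `Element` class, its field names, and the sample values are hypothetical, and real boundary decisions involve judgment ("essentially the same", "same general") that an exact string comparison cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One component being considered for inclusion in a system (hypothetical model)."""
    name: str
    manager: str         # who has direct management control
    mission: str         # function or mission objective
    security_needs: str  # operating characteristics and security needs
    environment: str     # general operating environment


def same_system(elements: list[Element]) -> bool:
    """Return True if every element shares all four boundary attributes."""
    profiles = {
        (e.manager, e.mission, e.security_needs, e.environment)
        for e in elements
    }
    # Zero or one distinct profile means nothing disagrees on any criterion.
    return len(profiles) <= 1


# Two stand-alone office PCs that are not connected to each other.
office_pcs = [
    Element("pc-1", "Office Manager", "payroll entry", "moderate", "main office"),
    Element("pc-2", "Office Manager", "payroll entry", "moderate", "main office"),
]
print(same_system(office_pcs))
```

Note that physical connection is deliberately absent from the model, which matches the point that components need not be connected to belong to one system.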

Multiple Similar Systems

An organization may have systems that differ only in the responsible organization or the physical environment in which they are located (e.g., air traffic control systems). In such instances, it is appropriate and recommended to use plans that are identical except for those areas of difference. This approach provides consistent levels of protection for similar systems.


Confidentiality, Integrity and Availability in an Information Technology System Security Plan

Both information and information systems have distinct life cycles. It is important that the degree of sensitivity of information be assessed by considering the requirements for availability, integrity, and confidentiality of the information. This process should occur at the beginning of the information system's life cycle and be re-examined during each life cycle stage. The integration of security considerations early in the life cycle avoids costly retrofitting of safeguards. However, security requirements can be incorporated during any life cycle stage. The purpose of this section is to review the system requirements against the need for availability, integrity, and confidentiality. By performing this analysis, the value of the system can be determined. The value is one of the first major factors in risk management. A system may need protection for one or more of the following reasons:

  • Confidentiality - The system contains information that requires protection from unauthorized disclosure.
  • Integrity - The system contains information which must be protected from unauthorized, unanticipated, or unintentional modification.
  • Availability - The system contains information or provides services which must be available on a timely basis to meet mission requirements or to avoid substantial losses.

A security plan for an information technology system should describe, in general terms, the information handled by the system and the need for protective measures. It needs to relate the information handled to each of the three basic protection requirements above (confidentiality, integrity and availability). It includes a statement of the estimated risk and magnitude of harm resulting from the loss, misuse, or unauthorized access to or modification of information in the system. To the extent possible, it describes this impact in terms of cost, inability to carry out mandated functions, timeliness, etc. For each of the three categories (confidentiality, integrity and availability), it provides an evaluation, and indicates if the protection requirement is:

  • High - a critical concern of the system;

  • Medium - an important concern, but not necessarily paramount in the organization's priorities; or

  • Low - some minimal level of security is required, but not to the same degree as the previous two categories.

Examples of a General Protection Requirement Statement

A high degree of security for the system is considered mandatory to protect the confidentiality, integrity, and availability of information. The protection requirements for all applications are critical concerns for the system.

Or, confidentiality is not a concern for this system as it contains information intended for immediate release to the general public concerning severe storms. The integrity of the information, however, is extremely important to ensure that the most accurate information is provided to the public to allow them to make decisions about the safety of their families and property. The most critical concern is to ensure that the system is available at all times to acquire, process, and provide warning information immediately about life-threatening storms.
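
One way to keep these evaluations consistent across plans is to record them in a small structure. The sketch below is illustrative only: the `Level` and `ProtectionRequirement` names are hypothetical, and mapping the storm-warning example to Low/High/High is an assumption drawn from the wording above ("not a concern", "extremely important", "most critical").

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ProtectionRequirement:
    """A system's rating for each of the three protection categories."""
    confidentiality: Level
    integrity: Level
    availability: Level

    def highest(self) -> Level:
        # The most demanding category is a rough first indicator of how much
        # protection the system as a whole calls for.
        return max(
            (self.confidentiality, self.integrity, self.availability),
            key=lambda level: level.value,
        )


# Assumed ratings for the storm-warning system in the example above.
storm_warning = ProtectionRequirement(
    confidentiality=Level.LOW,
    integrity=Level.HIGH,
    availability=Level.HIGH,
)
print(storm_warning.highest().name)  # prints "HIGH"
```

The structure records only the ratings; the plan still needs the narrative justification for each one.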

Example of Confidentiality Considerations

Evaluation: High, Medium, Low

High - The application contains proprietary business information and other financial information, which if disclosed to unauthorized sources, could cause unfair advantage for vendors, contractors, or individuals and could result in financial loss or adverse legal action to user organizations.

Medium - Security requirements for assuring confidentiality are of moderate importance. Having access to only small portions of the information has little practical purpose and the satellite imagery data does not reveal information involving national security.

Low - The mission of this system is to produce local weather forecast information that is made available to the news media forecasters and the general public at all times. None of the information requires protection against disclosure.

Example of Integrity Considerations

Evaluation: High, Medium, Low

High - The application is a financial transaction system. Unauthorized or unintentional modification of this information could result in fraud, under or over payments of obligations, fines, or penalties resulting from late or inadequate payments, and loss of public confidence.

Medium - Assurance of the integrity of the information is required to the extent that destruction of the information would require significant expenditures of time and effort to replace. Although corrupted information would present an inconvenience to the staff, most information, and all vital information, is backed up by either paper documentation or on disk.

Low - The system mainly contains messages and reports. If these messages and reports were modified by unauthorized, unanticipated or unintentional means, employees would detect the modifications; however, these modifications would not be a major concern for the organization.

Example of Availability Considerations

Evaluation: High, Medium, Low

High - The application contains personnel and payroll information concerning employees of the various user groups. Unavailability of the system could result in inability to meet payroll obligations and could cause work stoppage and failure of user organizations to meet critical mission requirements. The system requires 24-hour access.

Medium - Information availability is of moderate concern to the mission. Macintosh and IBM PC availability would be required within the four to five-day range. Information backups maintained at off-site storage would be sufficient to carry on with limited office tasks.

Low - The system serves primarily as a server for e-mail for the seven users of the system. Conference messages are duplicated between Seattle and D.C. servers. Should the system become unavailable, the D.C. users would connect to the Seattle server and continue to work with only the loss of old mail messages.


Physical and Environmental Protection

Physical and environmental security controls are implemented to protect the facility housing system resources, the system resources themselves, and the facilities used to support their operation. An organization's physical and environmental security program should address the following seven topics which are explained below. This section briefly describes the physical and environmental security controls that should be in place for a major application.

Explanation of Physical and Environmental Security

Access Controls

Physical access controls restrict the entry and exit of personnel (and often equipment and media) from an area, such as an office building, suite, data center, or room containing a local area network (LAN) server. Physical access controls should address not only the area containing system hardware, but also locations of wiring used to connect elements of the system, supporting services (such as electric power), backup media, and any other elements required for the system's operation. It is important to review the effectiveness of physical access controls in each area, both during normal business hours and at other times -- particularly when an area may be unoccupied.

Environmental Conditions

For many types of computer equipment, strict environmental conditions must be maintained. Manufacturer's specifications should be observed for temperature, humidity, and electrical power requirements.

Control of Media

The media upon which information is stored should be carefully controlled. Transportable media such as tapes and cartridges should be kept in secure locations, and accurate records kept of the location and disposition of each. In addition, media from an external source should be subject to a check-in process to ensure it is from an authorized source.

Control of Physical Hazards

Each area should be surveyed for potential physical hazards. Fire and water are two of the most damaging forces with regard to computer systems. Opportunities for loss should be minimized by an effective fire detection and suppression mechanism and by planning that reduces the danger of leaks or flooding. Other physical controls include reducing the visibility of the equipment and strictly limiting access to the area or equipment.

Fire Safety Factors

Building fires are a particularly important security threat because of the potential for complete destruction of both hardware and data, the risk to human life, and the pervasiveness of the damage. Smoke, corrosive gases, and high humidity from a localized fire can damage systems throughout an entire building. Consequently, it is important to evaluate the fire safety of buildings that house systems.

Failure of Supporting Utilities

Systems and the people who operate them need to have a reasonably well-controlled operating environment. Consequently, failures of electric power, heating and air-conditioning systems, water, sewage, and other utilities will usually cause a service interruption and may damage hardware. Organizations should ensure that these utilities, including their many elements, function properly.

Structural Collapse

Organizations should be aware that a building may be subjected to a load greater than it can support. Most commonly this results from an earthquake, a snow load on the roof beyond design criteria, an explosion that displaces or cuts structural members, or a fire that weakens structural members.

Plumbing Leaks

While plumbing leaks do not occur every day, they can be seriously disruptive. An organization should know the location of plumbing lines that might endanger system hardware and take steps to reduce risk (e.g., moving hardware, relocating plumbing lines, and identifying shutoff valves).

Interception of Data

Depending on the type of data a system processes, there may be a significant risk if the data is intercepted. Organizations should be aware that there are three routes of data interception: direct observation, interception of data transmission, and electromagnetic interception.

Mobile and Portable Systems

The analysis and management of risk usually has to be modified if a system is installed in a vehicle or is portable, such as a laptop computer. The system in a vehicle will share the risks of the vehicle, including accidents and theft, as well as regional and local risks. Organizations should:

  • Securely store laptop computers when they are not in use; and
  • Encrypt data files on stored media, when cost-effective, as a precaution against disclosure of information if a laptop computer is lost or stolen.

Computer Room Example

Appropriate and adequate controls will vary depending on the individual system requirements. The example list shows the types of controls for an application residing on a system in a computer room. The list is not intended to be all-inclusive or to imply that all systems should have all controls listed.

Production, Input/Output Controls

The information technology system security plan should provide a synopsis of the procedures in place that support the operations of the application. Below is a sampling of topics that should be reported.

For Computer Room:

In Place:

  • Card keys for building and work-area entrances
  • Twenty-four hour guards at all entrances/exits
  • Cipher lock on computer room door
  • Raised floor in computer room
  • Dedicated cooling system
  • Humidifier in tape library
  • Emergency lighting in computer room
  • Four fire extinguishers rated for electrical fires
  • One B/C-rated fire extinguisher
  • Smoke, water, and heat detectors
  • Emergency power-off switch by exit door
  • Surge suppressor
  • Emergency replacement server
  • Zoned dry pipe sprinkler system
  • Uninterruptible power supply for LAN servers
  • Power strips/suppressors for peripherals
  • Power strips/suppressors for computers
  • Controlled access to file server room

Planned:

  • Plastic sheets for water protection
  • Closed-circuit television monitors

Procedures to Use:

  • Procedures for recognizing, handling, and reporting incidents and/or problems.
  • Procedures to ensure unauthorized individuals cannot read, copy, alter, or steal printed or electronic information.
  • Procedures for ensuring that only authorized users pick up, receive, or deliver input and output information and media.
  • Audit trails for receipt of sensitive inputs/outputs.
  • Procedures for restricting access to output products.
  • Procedures and controls used for transporting or mailing media or printed output.
  • Internal/external labeling for appropriate sensitivity (e.g., Privacy Act, Proprietary).
  • External labeling with special handling instructions (e.g., log/inventory identifiers, controlled access, special storage instructions, release or destruction dates).
  • Audit trails for inventory management.
  • Media storage vault or library physical and environmental protection controls and procedures.
  • Procedures for sanitizing electronic media for reuse (e.g., overwrite or degaussing of electronic media).
  • Procedures for controlled storage, handling, or destruction of spoiled media or media that cannot be effectively sanitized for reuse.
  • Procedures for shredding or other destructive measures for hardcopy media when no longer required.


Data Integrity/Validation Controls

Data integrity controls are used to protect data from accidental or malicious alteration or destruction and to provide assurance to the user that the information meets expectations about its quality and that it has not been altered. Validation controls refer to tests and evaluations used to determine compliance with security specifications and requirements.

Security controls should be in place providing assurance to users that the information has not been altered and that the system functions as expected. The following questions are examples of some of the controls that fit in this category:

  • Is virus detection and elimination software installed? If so, are there procedures for:
    • Updating virus signature files;
    • Automatic and/or manual virus scans (automatic scan on network log-in, automatic scan on client/server power on, automatic scan on diskette insertion, automatic scan on download from an unprotected source such as the Internet, scan for macro viruses); and
    • Virus eradication and reporting?
  • Are reconciliation routines used by the system, i.e., checksums, hash totals, record counts? Include a description of the actions taken to resolve any discrepancies.
  • Are password crackers/checkers used?
  • Are integrity verification programs used by applications to look for evidence of data tampering, errors, and omissions? Techniques include consistency and reasonableness checks and validation during data entry and processing.
  • Describe the integrity controls used within the system.
  • Are intrusion detection tools installed on the system? Describe where the tool(s) are placed, the type of processes detected/reported, and the procedures for handling intrusions.
  • Is system performance monitoring used to analyze system performance logs in real time to look for availability problems, including active attacks, and system and network slowdowns and crashes?
  • Is penetration testing performed on the system? If so, what procedures are in place to ensure they are conducted appropriately?
  • Is message authentication used in the application to ensure that the sender of a message is known and that the message has not been altered during transmission?
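
The reconciliation routines asked about in the list above (checksums, hash totals, record counts) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout with `account` and `amount` fields is hypothetical, and SHA-256 is used as the checksum simply because it is available in the standard library.

```python
import hashlib


def control_totals(records: list[dict]) -> dict:
    """Compute a record count, a hash total, and a checksum for a batch."""
    # Hash total: a sum over a field with no arithmetic meaning of its own
    # (here, account numbers), used only to detect lost or altered records.
    hash_total = sum(int(r["account"]) for r in records)
    digest = hashlib.sha256()
    for r in records:
        digest.update(f'{r["account"]}|{r["amount"]}\n'.encode("utf-8"))
    return {
        "record_count": len(records),
        "hash_total": hash_total,
        "checksum": digest.hexdigest(),
    }


def reconcile(sent: dict, received: dict) -> list[str]:
    """Return the names of the control totals that do not match."""
    return [key for key in sent if sent[key] != received.get(key)]


batch = [
    {"account": "1001", "amount": "25.00"},
    {"account": "1002", "amount": "40.00"},
]
altered = [
    {"account": "1001", "amount": "25.00"},
    {"account": "1002", "amount": "400.00"},  # amount changed in transit
]
print(reconcile(control_totals(batch), control_totals(altered)))  # prints "['checksum']"
```

In this run the record count and hash total still agree, so only the checksum exposes the change. That is why the plan should describe which totals are used and what action follows a mismatch.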

State whether message authentication has been determined to be appropriate for your system. If so, describe the methodology.
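
As one possible methodology, a keyed hash (HMAC) lets the receiver confirm that a message came from a holder of a shared secret and was not altered in transit. The sketch below uses Python's standard `hmac` and `hashlib` modules. The key and message values are placeholders, and how the key is generated and distributed is outside the sketch.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return a hex authentication tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a received tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-key-distributed-out-of-band"  # placeholder value
message = b"Transfer 100.00 to account 1002"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                              # prints "True"
print(verify(shared_key, b"Transfer 900.00 to account 1002", tag))  # prints "False"
```

Because both parties hold the same key, this approach shows that the message came from someone with the key, not which party sent it. Where the sender must be identified individually, digital signatures are the usual alternative.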


Documentation

Documentation is a security control in that it explains how software/hardware is to be used and formalizes security and operational procedures specific to the system. Documentation for a system includes descriptions of the hardware and software, policies, standards, procedures, and approvals related to automated information system security in the application and the support system(s) on which it is processed, to include backup and contingency activities, as well as descriptions of user and operator procedures. Documentation should be coordinated with the general support system and/or network manager(s) to ensure that adequate application and installation documentation are maintained to provide continuity of operations. List the documentation maintained for the application. The example list is provided to show the type of documentation that would normally be maintained for a system and is not intended to be all-inclusive or imply that all systems should have all items listed.

Examples of Documentation for a Major Application:

  • Vendor-supplied documentation of hardware
  • Vendor-supplied documentation of software
  • Application requirements
  • Application security plan
  • General support system(s) security plan(s)
  • Application program documentation and specifications
  • Testing procedures and results
  • Standard operating procedures
  • Emergency procedures
  • Contingency plans
  • Memoranda of understanding with interfacing systems
  • Disaster recovery plans
  • User rules of behavior
  • User manuals
  • Risk assessment
  • Backup procedures
  • Authorization-to-process documents and statements


Technical Controls

Technical controls focus on security controls that the computer system executes. The controls can provide automated protection from unauthorized access or misuse, facilitate detection of security violations, and support security requirements for applications and data. The implementation of technical controls, however, always requires significant operational considerations and should be consistent with the management of security within the organization. In this section, describe the technical control measures (in place or planned) that are intended to meet the protection requirements of the major application.

Identification and Authentication

Identification and Authentication is a technical measure that prevents unauthorized people (or unauthorized processes) from entering an IT system. Access control usually requires that the system be able to identify and differentiate among users. For example, access control is often based on least privilege, which refers to the granting to users of only those accesses minimally required to perform their duties. User accountability requires the linking of activities on an IT system to specific individuals and, therefore, requires the system to identify users.

Identification

Identification is the means by which a user provides a claimed identity to the system. The most common form of identification is the user ID. In this section of your IT system security plan, briefly describe how the major application identifies access to the system.

Unique Identification

An organization should require users to identify themselves uniquely before being allowed to perform any actions on the system unless user anonymity or other factors dictate otherwise.

Correlate Actions to Users

The system should internally maintain the identity of all active users and be able to link actions to specific users.

Maintenance of User IDs

An organization should ensure that all user IDs belong to currently authorized users. Identification data must be kept current by adding new users and deleting former users.

Inactive User IDs

User IDs that are inactive on the system for a specific period of time (e.g., three months) should be disabled.
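
A periodic job can flag such IDs. The following is a minimal sketch: the `last_login` mapping is a hypothetical stand-in for whatever account store the system uses, and "three months" is approximated as 90 days.

```python
from datetime import datetime, timedelta

INACTIVITY_LIMIT = timedelta(days=90)  # assumption: "three months" taken as 90 days


def ids_to_disable(last_login: dict[str, datetime], now: datetime) -> list[str]:
    """Return user IDs whose most recent login is older than the limit."""
    return sorted(
        user for user, seen in last_login.items()
        if now - seen > INACTIVITY_LIMIT
    )


now = datetime(2007, 10, 26)
last_login = {
    "asmith": datetime(2007, 10, 20),  # active last week
    "bjones": datetime(2007, 6, 1),    # idle for well over three months
}
print(ids_to_disable(last_login, now))  # prints "['bjones']"
```

The sketch only reports candidates. Whether IDs are disabled automatically or reviewed first is a policy decision the plan should state.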

Authentication

Authentication is the means of establishing the validity of a user's claimed identity to the system. There are three means of authenticating a user's identity, which can be used alone or in combination:

  • Something the individual knows (a secret -- e.g., a password, Personal Identification Number (PIN), or cryptographic key);
  • Something the individual possesses (a token -- e.g., an ATM card or a smart card); and
  • Something the individual is (a biometric -- e.g., characteristics such as a voice pattern, handwriting dynamics, or a fingerprint).

For most applications, trade-offs will have to be made when evaluating the mode of authentication, including ease of use and ease of administration, especially in modern networked environments. While it may appear that any of these means could provide strong authentication, there are problems associated with each. If people want to pretend to be someone else on a computer system, they can guess or learn that individual's password; they can also steal or fabricate tokens. Each method also has drawbacks for legitimate users and system administrators: users forget passwords and may lose tokens, and the administrative overhead of keeping track of identification and authentication (I&A) data and tokens can be substantial. Biometric systems have significant technical, user acceptance, and cost problems as well.


This section of your IT system security plan describes the major application's authentication control mechanisms. Below is a list of items that should be considered in the description:

  • Describe the method of user authentication (password, token, and biometrics).
  • If a password system is used, provide the following specific information:
    • Allowable character set,
    • Password length (minimum, maximum),
    • Password aging time frames and enforcement approach,
    • Number of generations of expired passwords disallowed for use,
    • Procedures for password changes,
    • Procedures for handling lost passwords, and
    • Procedures for handling password compromise.
    • Indicate the frequency of password changes, describe how password changes are enforced (e.g., by the software or System Administrator), and identify who changes the passwords (the user, the system, or the System Administrator).
    • Note: The recommended minimum number of characters in a password is six to eight characters in a combination of alpha, numeric, or special characters.
  • Describe any biometrics controls used. Include a description of how the biometrics controls are implemented on the system.
  • Describe any token controls used on the system and how they are implemented.
    • Are special hardware readers required?
    • Are users required to use a unique Personal Identification Number (PIN)?
    • Who selects the PIN, the user or System Administrator?
    • Does the token use a password generator to create a one-time password?
    • Is a challenge-response protocol used to create a one-time password?
  • Describe the level of enforcement of the access control mechanism (network, operating system, and application).
  • Describe how the access control mechanism supports individual accountability and audit trails (e.g., passwords are associated with a user identifier that is assigned to a single individual).
  • Describe the self-protection techniques for the user authentication mechanism (e.g., passwords are transmitted and stored with one-way encryption to prevent anyone [including the System Administrator] from reading the clear-text passwords, passwords are automatically generated, passwords are checked against a dictionary of disallowed passwords, passwords are encrypted while in transmission).
  • State the number of invalid access attempts that may occur for a given user identifier or access location (terminal or port) and describe the actions taken when that limit is exceeded.
  • Describe the procedures for verifying that all system-provided administrative default passwords have been changed.
  • Describe the procedures for limiting access scripts with embedded passwords (e.g., scripts with embedded passwords are prohibited, scripts with embedded passwords are allowed only for batch applications).
  • Describe any policies that provide for bypassing user authentication requirements, single-sign-on technologies (e.g., host-to-host, authentication servers, user-to-host identifier, and group user identifiers) and any compensating controls.
  • Describe any use of digital or electronic signatures. Address the following specific issues:
    • Describe any use of digital or electronic signatures and the security control provided.
    • Discuss cryptographic key management procedures for key generation, distribution, storage, entry, use, destruction, and archiving.
  • Procedures for training users and the materials covered.
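
Several of the password items above (minimum length, character mix, a dictionary of disallowed passwords, one-way storage) can be illustrated together. This is a sketch under stated assumptions, not a recommended configuration. The minimum length of 8 takes the upper end of the range in the note above, requiring two character classes is one reading of "combination", the disallowed list is a tiny stand-in, and salted PBKDF2 from the standard library stands in for the "one-way encryption" the text mentions. The iteration count is arbitrary.

```python
import hashlib
import hmac
import os
from typing import Optional

MIN_LENGTH = 8                                     # assumed policy value
DISALLOWED = {"password", "12345678", "letmein1"}  # stand-in dictionary
ITERATIONS = 100_000                               # assumed work factor


def policy_violations(password: str) -> list[str]:
    """Return a list of policy problems; an empty list means acceptable."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    classes = [
        any(c.isalpha() for c in password),
        any(c.isdigit() for c in password),
        any(not c.isalnum() for c in password),
    ]
    if sum(classes) < 2:
        problems.append("needs at least two character classes")
    if password.lower() in DISALLOWED:
        problems.append("in disallowed dictionary")
    return problems


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the clear text."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the attempt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)
```

Because only the salt and digest are stored, not even the System Administrator can read the clear-text password. A login attempt is verified by re-deriving the digest and comparing.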


Logical Access Controls (Authorization/Access Controls)

Logical access controls are the system-based mechanisms used to specify who or what (e.g., in the case of a process) is to have access to a specific system resource and the type of access that is permitted. Here, your IT system security plan discusses the controls in place to authorize or restrict the activities of users and system personnel within the application. Describe hardware or software features that are designed to permit only authorized access to or within the application, to restrict users to authorized transactions and functions, and/or to detect unauthorized activities (e.g., access control lists). The following are areas that should be considered.

  • Describe formal policies that define the authority that will be granted to each user or class of users. Indicate if these policies follow the concept of least privilege which requires identifying the user's job functions, determining the minimum set of privileges required to perform that function, and restricting the user to a domain with those privileges and nothing more. Include in the description the procedures for granting new users access and the procedures for when the role or job function changes.
  • Identify whether the policies include separation of duties enforcement to prevent an individual from having all necessary authority or information access to allow fraudulent activity without collusion.
  • Describe the application's capability to establish an Access Control List or register of the users and the types of access they are permitted.
  • Indicate whether a manual Access Control List is maintained.
  • Indicate if the security software allows application owners to restrict the access rights of other application users, the general support system administrator, or operators to the application programs, data, or files.
  • Describe how application users are restricted from accessing the operating system, other applications, or other system resources not needed in the performance of their duties.
  • Indicate how often Access Control Lists are reviewed to identify and remove users who have left the organization or whose duties no longer require access to the application.
  • Describe controls to detect unauthorized transaction attempts by authorized and/or unauthorized users.
  • Describe policy or logical access controls that regulate how users may delegate access permissions or make copies of files or information accessible to other users. This "discretionary access control" may be appropriate for some applications, and inappropriate for others.
  • Document any evaluation made to justify/support use of "discretionary access control."
  • Indicate after what period of user inactivity the system automatically blanks associated display screens and/or after what period of user inactivity the system automatically disconnects inactive users or requires the user to enter a unique password before reconnecting to the system or application.
  • Describe any restrictions to prevent users from accessing the system or applications outside of normal work hours or on weekends. Discuss in-place restrictions.
  • Indicate if encryption is used to prevent unauthorized access to sensitive files as part of the system or application access control procedures. (If encryption is used primarily for authentication, include this information in the section above.) If encryption is used as part of the access controls, provide information about the following:
    • What cryptographic methodology (e.g., secret key and public key) is used?
    • If a specific off-the-shelf product is used, provide the name of the product.
    • If the product and the implementation method meet standards (e.g., Data Encryption Standard, Digital Signature Standard), include that information.
    • Discuss cryptographic key management procedures for key generation, distribution, storage, entry, use, destruction, and archiving.
  • If your application is running on a system that is connected to the Internet or other wide area network(s), discuss what additional hardware or technical controls have been installed and implemented to provide protection against unauthorized system penetration and other known Internet threats and vulnerabilities.
  • Describe any type of secure gateway or firewall in use, including its configuration (e.g., configured to restrict access to critical system resources and to disallow certain types of traffic to pass through to the system).
  • Provide information regarding any port protection devices used to require specific access authorization to the communication ports, including the configuration of the port protection devices, and if additional passwords or tokens are required.
  • Identify whether internal security labels are used to control access to specific information types or files, and if such labels specify protective measures or indicate additional handling instructions.
  • Indicate if host-based authentication is used. (This is an access control approach that grants access based on the identity of the host originating the request, instead of the individual user requesting access.)
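
Several of the items above (least privilege, separation of duties, an Access Control List, and periodic ACL review) can be illustrated with a minimal sketch. The role names, permission names, and the 90-day review interval below are hypothetical assumptions chosen for illustration; they are not taken from any particular product or policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical roles, each mapped to the minimum permissions its job function needs.
ROLE_PERMISSIONS = {
    "clerk": {"read_record"},
    "payments": {"read_record", "create_payment"},
    "supervisor": {"read_record", "approve_payment"},
}

# Hypothetical pairs of permissions that no single role should hold together.
CONFLICTING_PERMISSIONS = [{"create_payment", "approve_payment"}]


@dataclass
class AclEntry:
    user: str
    role: str
    last_reviewed: date
    active: bool = True


def is_permitted(acl, user, action):
    """Allow an action only if the user's role explicitly grants it."""
    entry = acl.get(user)
    if entry is None or not entry.active:
        return False  # default deny: unknown or deactivated users get nothing
    return action in ROLE_PERMISSIONS.get(entry.role, set())


def violates_separation_of_duties(role):
    """Return True if one role holds every permission in a conflicting set."""
    permissions = ROLE_PERMISSIONS.get(role, set())
    return any(conflict <= permissions for conflict in CONFLICTING_PERMISSIONS)


def entries_due_for_review(acl, today, max_age_days=90):
    """Return users whose ACL entry was last reviewed more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(user for user, entry in acl.items() if entry.last_reviewed < cutoff)
```

The sketch denies by default, so a user who is missing from the list or has been deactivated receives no access. A real application would normally rely on the access control features of its operating system or security software instead of a table held in program code.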


Conducting a Sensitivity Assessment

A sensitivity assessment looks at the sensitivity of both the information to be processed and the system itself. The assessment should consider legal implications, organization policy, and the functional needs of the system. Sensitivity is normally expressed in terms of integrity, availability, and confidentiality. Such factors as the importance of the system to the organization's mission and the consequences of unauthorized modification, unauthorized disclosure, or unavailability of the system or data need to be examined when assessing sensitivity. To address these types of issues, the people who use or own the system or information should participate in the assessment.

A sensitivity assessment should answer the following questions:

  • What information is handled by the system?
  • What kind of potential damage could occur through error, unauthorized disclosure or modification, or unavailability of data or the system?
  • What laws or regulations affect security (e.g., the Privacy Act or the Fair Trade Practices Act)?
  • To what threats is the system or information particularly vulnerable?
  • Are there significant environmental considerations (e.g., hazardous location of system)?
  • What are the security-relevant characteristics of the user community (e.g., level of technical sophistication and training or security clearances)?
  • What internal security standards, regulations, or guidelines apply to this system?


Operational Assurance

Security is never perfect when a system is implemented. In addition, system users and operators discover new ways to intentionally or unintentionally bypass or subvert security. Changes in the system or the environment can create new vulnerabilities. Strict adherence to procedures is rare over time, and procedures become outdated. Thinking risk is minimal, users may tend to bypass security measures and procedures.

Operational assurance is one way of becoming aware of these changes, whether they are new vulnerabilities (or old vulnerabilities that have not been corrected), system changes, or environmental changes. Operational assurance is the process of reviewing an operational system to see that security controls, both automated and manual, are functioning correctly and effectively.

Design and implementation assurance addresses the quality of security features built into systems. Operational assurance addresses whether the system's technical features are being bypassed or have vulnerabilities and whether required procedures are being followed. It does not address changes in the system's security requirements, which could be caused by changes to the system and its operating or threat environment.

Security tends to degrade during the operational phase of the system life cycle. System users and operators discover new ways to intentionally or unintentionally bypass or subvert security (especially if there is a perception that bypassing security improves functionality). Users and administrators often think that nothing will happen to them or their system, so they shortcut security. Strict adherence to procedures is rare, procedures become outdated, and errors in the system's administration commonly occur.

Organizations use two basic methods to maintain operational assurance:

System Audit - A one-time or periodic event to evaluate security. An audit can vary widely in scope: it may examine an entire system for the purpose of re-accreditation or it may investigate a single anomalous event.

Monitoring - An ongoing activity that checks on the system, its users, or the environment.

These terms are used loosely within the computer security community and often overlap. In general, the more "real-time" an activity is, the more it falls into the category of monitoring. Daily or weekly reviewing of the audit trail (for unauthorized access attempts) is generally monitoring, while an historical review of several months' worth of the trail (tracing the actions of a specific user) is probably an audit.

An audit conducted to support operational assurance examines whether the system is meeting stated or implied security requirements, including system and organization policies. The essential difference between a self-audit and an independent audit is objectivity. Reviews done by system management staff, often called self-audits or self-assessments, have an inherent conflict of interest.

Automated security audit tools make it feasible to review even large computer systems for a variety of security flaws. There are two types of automated tools: (1) active tools, which find vulnerabilities by trying to exploit them, and (2) passive tools, which only examine the system and infer the existence of problems from the state of the system.

Automated tools can be used to help find a variety of threats and vulnerabilities, such as improper access controls or access control configurations, weak passwords, lack of integrity of the system software, or missing software updates and patches. These tools are often very successful at finding vulnerabilities and are sometimes used by hackers to break into systems. Not taking advantage of these tools puts system administrators at a disadvantage. Many of the tools are simple to use; however, some programs (such as access-control auditing tools for large mainframe systems) require specialized skill to use and interpret.

Several types of automated tools monitor a system for security problems. Some examples follow:

Virus scanners are a popular means of checking for virus infections. These programs test for the presence of viruses in executable program files.

Checksumming presumes that program files should not change between updates. It works by generating a mathematical value based on the contents of a particular file. When the integrity of the file is to be verified, the checksum is generated on the current file and compared with the previously generated value. If the two values are equal, the integrity of the file is verified. Program checksumming can detect viruses, Trojan horses, accidental changes to files caused by hardware failures, and other changes to files. However, the stored checksums may themselves be subject to covert replacement by a system intruder. Digital signatures can also be used.
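
The checksum comparison described above can be sketched as follows. This is an illustration only: the choice of SHA-256 (via Python's standard hashlib module) and the function names are assumptions, since the text does not name a particular algorithm or tool.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record the current digest of each program file (run after a trusted update)."""
    return {str(path): file_digest(path) for path in paths}


def changed_files(baseline):
    """Return the files that are missing or whose digest no longer matches the baseline."""
    changed = []
    for path, expected in baseline.items():
        if not Path(path).is_file() or file_digest(path) != expected:
            changed.append(path)
    return changed
```

Because an intruder who can alter a program may also be able to alter the recorded values, the baseline would need to be kept somewhere the monitored system cannot write to, such as read-only or offline media.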

Password crackers check passwords against a dictionary (either a "regular" dictionary or a specialized one with easy-to-guess passwords) and also check if passwords are common permutations of the user ID.

Examples of special dictionary entries could be the names of regional sports teams and stars; common permutations could be the user ID spelled backwards.
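
As a rough sketch of the kind of check a password cracker performs, the following compares a candidate password against a word list and against simple permutations of the user ID. The particular permutations (reversed, doubled, and a numeric suffix) and the sample word list are assumptions made for illustration. Real tools usually work against stored password hashes, whereas this sketch tests a plaintext candidate, as might be done when a user chooses a new password.

```python
def candidate_guesses(user_id, dictionary):
    """Build the set of guesses a simple cracker might try for one account."""
    uid = user_id.lower()
    guesses = {word.lower() for word in dictionary}
    # Permutations of the user ID: as-is, spelled backwards, and doubled.
    guesses.update({uid, uid[::-1], uid + uid})
    # Hypothetical suffix variants that users commonly append.
    guesses.update({guess + suffix for guess in guesses for suffix in ("1", "123")})
    return guesses


def is_weak(password, user_id, dictionary):
    """Return True if the password matches any generated guess (case-insensitive)."""
    return password.lower() in candidate_guesses(user_id, dictionary)
```

For example, with the word list ["password", "tigers"], the passwords "Password", "htimsj" (for user ID "jsmith"), and "tigers123" would all be flagged as weak.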

Integrity verification programs can be used by applications to look for evidence of data tampering, errors, and omissions. Techniques include consistency and reasonableness checks and validation during data entry and processing. These techniques can check data elements, as input or as processed, against expected values or ranges of values; analyze transactions for proper flow, sequencing, and authorization; or examine data elements for expected relationships. These programs comprise a very important set of processes because they can be used to convince people that, if they do what they should not do, accidentally or intentionally, they will be caught. Many of these programs rely upon logging of individual user activities.
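
The consistency, reasonableness, sequencing, and authorization checks mentioned above might look like the following sketch for a single transaction record. The field names and the numeric limits are hypothetical; an actual application would take them from its own data definitions and business rules.

```python
def check_transaction(txn, last_sequence, max_amount=10_000, approval_limit=1_000):
    """Return a list of problems found in one transaction record (a dict)."""
    problems = []
    # Reasonableness: the value must fall within an expected range.
    if not (0 < txn["amount"] <= max_amount):
        problems.append("amount outside expected range")
    # Expected relationship between data elements.
    if txn["debit_account"] == txn["credit_account"]:
        problems.append("debit and credit accounts are the same")
    # Sequencing: records should arrive in order with no gaps or repeats.
    if txn["sequence"] != last_sequence + 1:
        problems.append("sequence gap or repeat")
    # Authorization: large amounts need a recorded approver.
    if txn["amount"] > approval_limit and not txn.get("approved_by"):
        problems.append("missing approval for large amount")
    return problems
```

An empty list means that the record passed every check; any findings would normally be logged together with the identity of the user who entered the transaction.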

Intrusion detectors analyze the system audit trail, especially log-ons, connections, operating system calls, and various command parameters, for activity that could represent unauthorized activity.
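
One simple form of audit trail analysis is counting failed log-ons by originating host. The sketch below assumes, for illustration only, that each audit record has already been parsed into a (source, user, outcome) tuple and that five failures is a reasonable alert threshold; real intrusion detectors examine far more than this.

```python
from collections import Counter


def suspicious_sources(events, threshold=5):
    """Return sources whose failed log-on count meets or exceeds the threshold.

    events: iterable of (source, user, outcome) tuples, where outcome is
    "OK" or "FAIL" (a hypothetical record layout).
    """
    failures = Counter(source for source, _user, outcome in events if outcome == "FAIL")
    return {source: count for source, count in failures.items() if count >= threshold}
```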

System performance monitoring analyzes system performance logs in real time to look for availability problems, including active attacks (such as the 1988 Internet worm) and system and network slowdowns and crashes.

An auditor can review controls in place and determine whether they are effective. The auditor will often analyze both computer and non-computer based controls. Techniques used include inquiry, observation, and testing (of both the controls themselves and the data). The audit can also detect illegal acts, errors, irregularities, or a lack of compliance with laws and regulations. Security checklists and penetration testing may also be used.


Computer Security Incident Handling

Computer systems are subject to a wide range of mishaps -- from corrupted data files, to viruses, to natural disasters. Some of these mishaps can be fixed through standard operating procedures. For example, frequently occurring events (e.g., a mistakenly deleted file) can usually be readily repaired (e.g., by restoration from the backup file). More severe mishaps, such as outages caused by natural disasters, are normally addressed in an organization's contingency plan. Other damaging events result from deliberate malicious technical activity (e.g., the creation of viruses or system hacking).

A computer security incident can result from a computer virus, other malicious code, or a system intruder, either an insider or an outsider. Although the threats that hackers and malicious code pose to systems and networks are well known, the occurrence of such harmful events remains unpredictable. Security incidents on larger networks (e.g., the Internet), such as break-ins and service disruptions, have harmed various organizations' computing capabilities. It is cost-beneficial to develop a standing capability for quick discovery of and response to such events. This is especially true because incidents can often "spread" when left unchecked, increasing the damage and seriously harming an organization. This section describes how organizations can address computer security incidents (in the context of their larger computer security program) by developing a computer security incident handling capability.

The primary benefits of an incident handling capability are containing and repairing damage from incidents, and preventing future damage. An incident handling capability gives users a way to report incidents and ensures that an appropriate response and assistance are provided to aid in recovery. Technical capabilities (e.g., trained personnel and virus identification software) are pre-positioned, ready to be used as necessary. Moreover, the organization will have already made important contacts with other supportive sources (e.g., legal, technical, and managerial) to aid in containment and recovery efforts. Intruder activity, whether by hackers or malicious code, can often affect many systems located at many different network sites; thus, handling the incidents can be logistically complex and can require information from outside the organization. By planning ahead, such contacts can be pre-established and the speed of response improved, thereby containing and minimizing damage.

As in any set of pre-planned procedures, attention must be paid to a set of goals for handling an incident. These goals will be prioritized differently depending on the site. A specific set of objectives can be identified for dealing with incidents:

(1) Figure out how it happened.
(2) Find out how to avoid further exploitation of the same vulnerability.
(3) Avoid escalation and further incidents.
(4) Assess the impact and damage of the incident.
(5) Recover from the incident.
(6) Update policies and procedures as needed.
(7) Find out who did it (if appropriate and possible).

Due to the nature of the incident, there might be a conflict between analyzing the original source of a problem and restoring systems and services. Overall goals (like assuring the integrity of critical systems) might be the reason for not analyzing an incident. Of course, this is an important management decision; but all involved parties must be aware that without analysis the same incident may happen again.

It is also important to prioritize the actions to be taken during an incident well in advance of the time an incident occurs. Sometimes an incident may be so complex that it is impossible to do everything at once to respond to it; priorities are essential. Although priorities will vary from institution to institution, the following suggested priorities may serve as a starting point for defining your organization's response:

(1) Priority one -- protect human life and people's safety; human life always has precedence over all other considerations.

(2) Priority two -- protect classified and/or sensitive data. Prevent exploitation of classified and/or sensitive systems, networks or sites. Inform affected classified and/or sensitive systems, networks or sites about penetrations that have already occurred. (Be aware of regulations set by your site or by the government.)

(3) Priority three -- protect other data, including proprietary, scientific, managerial and other data, because loss of data is costly in terms of resources. Prevent exploitations of other systems, networks or sites and inform already affected systems, networks or sites about successful penetrations.

(4) Priority four -- prevent damage to systems (e.g., loss or alteration of system files, damage to disk drives, etc.). Damage to systems can result in costly down time and recovery.

(5) Priority five -- minimize disruption of computing resources (including processes). It is better in many cases to shut a system down or disconnect from a network than to risk damage to data or systems. Sites will have to evaluate the trade-offs between shutting down and disconnecting, and staying up. There may be service agreements in place that may require keeping systems up even in light of further damage occurring. However, the damage and scope of an incident may be so extensive that service agreements may have to be overridden.

An important implication for defining priorities is that once human life and national security considerations have been addressed, it is generally more important to save data than system software and hardware. Although it is undesirable to have any damage or loss during an incident, systems can be replaced. However, the loss or compromise of data (especially classified or proprietary data) is usually not an acceptable outcome under any circumstances.
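
The suggested priorities above can be recorded ahead of time so that, during an incident, pending response actions are worked in a consistent order. The following is only a sketch: the enumeration names and the sample actions are hypothetical, and each site would substitute its own priorities.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Suggested response priorities; a lower number is handled first."""
    HUMAN_SAFETY = 1
    SENSITIVE_DATA = 2
    OTHER_DATA = 3
    SYSTEM_DAMAGE = 4
    DISRUPTION = 5


def order_actions(actions):
    """Sort (Priority, description) pairs so the most urgent action comes first.

    The sort is stable, so actions with equal priority keep their original order.
    """
    return [description for _priority, description in sorted(actions, key=lambda a: a[0])]
```

For example, given the pending actions "notify users of outage" (DISRUPTION), "isolate file server" (SENSITIVE_DATA), and "evacuate machine room" (HUMAN_SAFETY), the function returns them in the order evacuate, isolate, notify.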

An incident handling capability also assists an organization in preventing (or at least minimizing) damage from future incidents. Incidents can be studied internally to gain a better understanding of the organization's threats and vulnerabilities so more effective safeguards can be implemented. Additionally, through outside contacts (established by the incident handling capability) early warnings of threats and vulnerabilities can be provided. Mechanisms will already be in place to warn users of these risks.

The incident handling capability allows an organization to learn from the incidents that it has experienced. Data about past incidents (and the corrective measures taken) can be collected. The data can be analyzed for patterns -- for example, which viruses are most prevalent, which corrective actions are most successful, and which systems and information are being targeted by hackers. Vulnerabilities can also be identified in this process -- for example, whether damage is occurring to systems when a new software package or patch is used. Knowledge about the types of threats that are occurring and the presence of vulnerabilities can aid in identifying security solutions. This information will also prove useful in creating a more effective training and awareness program -- and thus help reduce the potential for losses. The incident handling capability assists the training and awareness program by providing information to users as to (1) measures that can help avoid incidents (e.g., virus scanning) and (2) what should be done in case an incident does occur.
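
The pattern analysis described above can be as simple as tallying past incident records. In the sketch below, the record fields ("type", "target", "fix", and "resolved") are a hypothetical layout chosen for illustration; an organization would use whatever its incident log actually captures.

```python
from collections import Counter


def summarize_incidents(incidents):
    """Summarize incident records (dicts) by type, by target, and by corrective action.

    Returns counts per incident type and per targeted system, plus, for each
    corrective action, a (times tried, times successful) pair.
    """
    by_type = Counter(record["type"] for record in incidents)
    by_target = Counter(record["target"] for record in incidents)
    fix_success = {}
    for record in incidents:
        tried, worked = fix_success.get(record["fix"], (0, 0))
        fix_success[record["fix"]] = (tried + 1, worked + int(record["resolved"]))
    return {"by_type": by_type, "by_target": by_target, "fix_success": fix_success}
```

Calling most_common(1) on the "by_type" or "by_target" counter then shows which kind of incident is most prevalent or which system is targeted most often.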

A successful incident handling capability has several core characteristics:

  • an understanding of the constituency it will serve;
  • an educated constituency;
  • a means for centralized communications;
  • expertise in the requisite technologies; and
  • links to other groups to assist in incident handling (as needed).

Incident handling will be greatly enhanced by technical mechanisms that enable the dissemination of information quickly and conveniently. The technical ability to report incidents is of primary importance, since without knowledge of an incident, response is precluded. Fortunately, such technical mechanisms are already in place in many organizations.

For rapid response to constituency problems, a simple telephone "hotline" is practical and convenient. Some agencies may already have a number used for emergencies or for obtaining help with other problems; it may be practical (and cost-effective) to also use this number for incident handling. It may be necessary to provide 24-hour coverage for the hotline. This can be done by staffing the answering center, by providing an answering service for non-office hours, or by using a combination of an answering machine and personal pagers.

If additional mechanisms for contacting the incident handling team can be provided, it may increase access and thus benefit incident handling efforts. A centralized e-mail address that forwards mail to staff members would permit the constituency to conveniently exchange information with the team.

One way to establish a centralized reporting and incident response capability, while minimizing expenditures, is to use an existing Help Desk. Many agencies already have central Help Desks for fielding calls about commonly used applications, troubleshooting system problems, and providing help in detecting and eradicating computer viruses. By expanding the capabilities of the Help Desk and publicizing its telephone number (or e-mail address), an agency may be able to significantly improve its ability to handle many different types of incidents at minimal cost.
