CAIDA: Cooperative Association for Internet Data Analysis
The DHS PREDICT Project

This page describes CAIDA's participation in and contributions to the DHS Protected REpository for the Defense of Infrastructure against Cyber Threats (PREDICT) project. This project currently receives support from DHS contract NBCHC070133, "Supporting Research and Development of Security Technologies through Network and Security Data Collection."

PREDICT Project Overview

To develop hardware and software that protect against hacking attempts and malicious software and mitigate their effects, researchers require current data on Internet security threats: samples of normal and malicious Internet traffic, malicious software samples, logs from machines compromised in targeted attacks, and other data. Concerns over privacy, security, proprietary information, and legal risks make it difficult for infrastructure owners, data owners, data collectors, and data distributors to collect and distribute such data. As a result, few organizations make datasets available for the development and testing of defensive technologies.

The Department of Homeland Security (DHS) has developed the Protected Repository for the Defense of Infrastructure Against Cyber Threats (PREDICT) project to provide vetted researchers with current network operational data in a secure and controlled manner that respects the security, privacy, legal, and economic concerns of Internet users and network operators. You can learn more about PREDICT in the Overview of the PREDICT program (DHS.gov PDF document).

The DHS established the PREDICT project to meet three primary goals.

  1. To provide vetted researchers with a Web-based portal that catalogs current computer network and operational data and provides data request infrastructure.
  2. To provide secure access to multiple sources of data collected from Internet use and traffic.
  3. To facilitate data flow among PREDICT participants for the purpose of developing new models, technologies and products that support effective threat assessment and increase cyber security capabilities.

CAIDA's Role in the PREDICT Project

CAIDA has been involved with the development of the PREDICT program since its inception, and CAIDA personnel have served in an advisory capacity on all committees developing and implementing PREDICT processes and procedures. CAIDA participates in the PREDICT program as a Data Provider, collecting routing data, peering point passive traces, and denial-of-service attack and Internet worm data from the UCSD Network Telescope. CAIDA is also a Data Host, serving that data to researchers who have been vetted and approved through the PREDICT program. Because of its Data Host and Data Provider roles, CAIDA will have a representative serve on the PREDICT Application Review and Publication Review Boards when they consider applications and publications that involve data that CAIDA collects or distributes.

Major project activities include:

  • collecting, documenting, anonymizing, and distributing routing, peering point, and UCSD Network Telescope data,
  • advising on technical, legal, and practical aspects of PREDICT policies and procedures, and
  • creating an index of anonymization techniques and the advantages and pitfalls of using them on Internet datasets.
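
To make the trade-offs behind anonymization concrete, the following is a minimal sketch of one simple technique, address truncation, in Python. It is an illustration only and does not describe the methods CAIDA applies to its datasets; the function name `truncate_ipv4`, the `keep_bits` parameter, and the sample addresses (taken from the reserved documentation ranges) are all hypothetical.

```python
import ipaddress


def truncate_ipv4(addr: str, keep_bits: int = 24) -> str:
    """Zero the low (32 - keep_bits) bits of an IPv4 address.

    Hypothetical helper for illustration: it keeps the network prefix
    and discards the host portion of the address.
    """
    if not 0 <= keep_bits <= 32:
        raise ValueError("keep_bits must be between 0 and 32")
    value = int(ipaddress.IPv4Address(addr))
    # Build a mask with the top keep_bits set, limited to 32 bits.
    mask = (0xFFFFFFFF << (32 - keep_bits)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value & mask))


# Sample addresses from the documentation ranges (assumed input).
sample = ["192.0.2.17", "192.0.2.200", "198.51.100.5"]
anonymized = [truncate_ipv4(a) for a in sample]
print(anonymized)
```

The sketch shows both an advantage and a pitfall: truncation keeps network-level structure, so traffic can still be grouped by prefix, but hosts within the same prefix become indistinguishable, which rules out per-host analysis.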

Data Sets

CAIDA released the following datasets with support from DHS. The creation of some of these datasets was cost-shared with other federal agency and private sector funding sources.

We developed the CAIDA Data FAQ to address questions from researchers.

We also provide information on data sharing and anonymization, including a list of relevant papers.

Previous Funding

Previous PREDICT-related efforts were funded under DHS contract NBCHC040159: "Network Traffic Data Repository to Develop Secure IT Infrastructure."


Cooperative Association for Internet Data Analysis (CAIDA)
  Last Modified: Fri Oct-24-2008 15:57:58 PDT
  Maintained by: Alex Ma
  Page URL: http://www.caida.org/projects/predict/index.xml