Investigative Data Warehouse
The Investigative Data Warehouse, or IDW, is a searchable database operated by the FBI. It was created in 2004. Much of the nature and scope of the database is classified. The database centralizes multiple federal and state databases, including criminal records from various law enforcement agencies, the U.S. Department of the Treasury’s Financial Crimes Enforcement Network (FinCEN), and public records databases. According to Michael Morehart's testimony before the House Committee on Financial Services in 2006, the "IDW is a centralized, web-enabled, closed system repository for intelligence and investigative data. This system, maintained by the FBI, allows appropriately trained and authorized personnel throughout the country to query for information of relevance to investigative and intelligence matters." (Morehart 2005, op. cit.)

Overview

The size of the database appears to be growing rapidly. In 2004, according to a government solicitation for bids to manage the project, it was approximately 10 TB in size. In 2005, according to one FBI official, the IDW contained approximately 100 million documents. In 2006 it contained more than 560 million documents and was accessible by more than 12,000 individuals. According to the FBI's website, as of August 22, 2007, the database contained 700 million records from 53 databases and was accessible by 13,000 individuals around the world.

The FBI is the subject of a lawsuit brought by the Electronic Frontier Foundation (EFF) because of a lack of public notice describing the database and the criteria for including personal information, as required by the Privacy Act of 1974. The lawsuit is a result of two Freedom of Information Act requests filed by the EFF in 2006.

The IDW was built in part by Chiliad corporation, the FBI Office of the Chief Technology Officer, and others. Companies listed in the FOIA files include Northrop Grumman and others.

Purpose

The Investigative Data Warehouse-Secret (IDW-S) "provides data and data processing/analysis services to FBI agents and analysts as they perform counter-terrorism, counter-intelligence, and law enforcement missions". The Core Subsystem supports the Counter-Terrorism Division (CTD), the Special Event Unit, and, via DOCLAB-S, the Joint Intelligence Committee Investigation (JICI) and IntelPlus.

According to a 2005 email, "IDW will also be used for criminal and other authorized non-CT investigations as it evolves." (CT stands for counter-terrorism.)

Subsystems

Within the system, there were subsystems named IDW-S Core, SPT, and DOCLAB-S.

SPT

SPT stands for Special Projects Team. It allows for the rapid import of new specialized data sources. These data sources are not made available to general IDW users but are instead provided to a small group of users who have a demonstrated "need-to-know". The SPT System is similar in function to the IDW-S system; the main difference is that it draws on a different set of data sources. The SPT System allows its users to access not only the standard IDW Data Store but also the specialized SPT Data Store.

Privacy

According to internal emails, the FBI performed several Privacy Impact Assessments (PIAs) of the IDW system. The Bureau worked with lawyers from its National Security Law Branch (NSLB) to try to ensure that the system complied with various laws regarding the sharing of information and secrecy (for example, Rule 6(e) of the Federal Rules of Criminal Procedure, regarding the secrecy of grand jury material).

The Information Sharing Policy Group (ISPG) formed a Discretionary Access Control Team (DACT) to work on "approval of data sets" and "access control requirements" for IDW and DataMart, and to respond to other Intelligence Community agencies requesting access.

The EFF FOIA IDW website states "Despite the vast amount of personal information contained in the IDW, the FBI has never published a Privacy Act notice describing the system or explaining the ways in which the records might be used."

There was also a 2005 email from someone in the Office of the General Counsel (OGC) about "preliminary staff musings that maybe we should limit FBI PIA requirements to non-NS systems" (NS being National Security). There was also an email from 2006 saying that "national security systems are exempt from E-Gov", apparently referring to the E-Government Act of 2002, which has a section that deals with privacy.

Data sources

The IDW used many data sources. The FOIA documents from EFF are heavily redacted, but some of the sources are as follows:
  • FBI Automated Case Support system (ACS), subset of the Electronic Case File (ECF) system
  • Joint Intelligence Committee Investigation documents (JICI), with OCR text
  • "Open Source News" (public websites, such as the Washington Post and others)
  • Secure Automated Messaging Network (SAMNet)
  • Violent Gang and Terrorist Organizing File (VGTOF)
  • DARPA TIDES program ('open source news' that has been organized and collected)
  • IntelPlus Filerooms, with OCR text
  • FBI National Crime Information Center (NCIC)
  • FBI Records Management Division (RMD), Document Laboratory (DocLab), FBIHQ
  • MiTAP (collects data from public sources, websites, etc.)
  • SPT-specific data sources (partial list; large parts of the FOIA files are redacted):
    • Unified Name Index (UNI) extracts
    • Financial Center (FinCen), including Bank Secrecy Act data
    • "Various Sources", including the Transportation Security Administration (TSA)
    • FBI Counterterrorism Division (CTD)
    • Telephone numbers / addresses from ACS
    • Case data from ACS
    • Terrorist Watch List (TWL)
    • "Other NJTTF data"
    • DoS ... Lost/Stolen Passport data
    • No Fly List, from TSA
    • Selectee list, from TSA
    • ACS/ECF with some case types excluded
    • CIA non-TS/non-SCI Technical Discussions (TDs) and Intelligence Information Reports (IIRs) from 1978 to May 2004


There was also talk of linking the FTTTF "Data Mart" with IDW.

The data in IDW is classified at the 'Secret' level or lower. Higher classifications are not allowed, and data carrying a higher classification can be removed.

See also

  • Open source intelligence
  • FBI Index
  • DCSNet
  • Project SHAMROCK (NSA)
  • Project MINARET (NSA)

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 