Data Sources

The Epstein Document Archive sources all 207,251 documents exclusively from official U.S. government releases. Below is a complete list of data sources, the number of documents from each, and descriptions of what they contain. We do not alter, fabricate, or editorialize any content.

Source Categories: 8
Total Documents: 207,251
Government Sources: 100%

All Sources

DOJ Epstein Library - Court Records

12,521 documents (6.0% of archive)

Court records from the Jeffrey Epstein investigation released through the DOJ Epstein Library. Includes indictments, plea agreements, motions, orders, and related filings from both the Southern District of Florida and the Southern District of New York.

DOJ - EFTA Photos

7,607 documents (3.7% of archive)

Photographs transferred via the Electronic File Transfer Application (EFTA) from the DOJ. These include evidence photos from properties, items, and locations related to the investigation.

DOJ - Manual Photos

7,683 documents (3.7% of archive)

Manually uploaded photographs from the DOJ release. These complement the EFTA photos and include additional evidence images, property photos, and investigation materials.

DOJ - Unclassified Materials

2,632 documents (1.3% of archive)

Unclassified materials from the DOJ release that do not fall into the other specific categories. Includes miscellaneous investigation documents, reports, and correspondence.

DOJ - Data Sets 1-5

5,703 documents (2.8% of archive)

Five structured data sets released by the DOJ containing additional investigation records. These data sets were released in batches and contain a range of document types, including financial records, communications, and investigative notes.

House Oversight Committee

1,124 documents (0.5% of archive)

Documents released by the U.S. House Oversight Committee related to the Epstein investigation. These include committee reports, transcripts, and related materials from congressional oversight activities.

FBI Vault

850 documents (0.4% of archive)

FBI investigation files obtained from the FBI Vault, the FBI's electronic reading room. These include FBI reports, memos, and investigative materials related to Epstein that have been declassified and released.

FOIA Releases

600 documents (0.3% of archive)

Documents obtained through Freedom of Information Act (FOIA) requests. These include records from various federal agencies that were released in response to public records requests related to Epstein.

Data Integrity

Every document in the archive is traceable to its official government source. We maintain SHA-256 file hashes for all original documents to ensure integrity. The processing pipeline (OCR, text extraction, entity recognition) is documented on our methodology page.
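
As an illustration of how a SHA-256 file hash can be used to check integrity, here is a minimal Python sketch. The function names are hypothetical, and it assumes you have both a local copy of an original file and the hex digest recorded for it. It is not a description of the archive's actual tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large PDFs are never fully loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path, recorded_hex):
    """Compare a file's digest with a recorded hex digest (assumed format)."""
    return sha256_of_file(path) == recorded_hex.strip().lower()
```

If the file has been changed in any way since the hash was recorded, `matches_recorded_hash` returns `False`. A match indicates the bytes are identical to those that were hashed.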

Frequently Asked Questions

Where do the documents in the archive come from?
All 207,251 documents are sourced from publicly available U.S. government releases. The primary source is the DOJ Epstein Library, supplemented by FBI Vault files, House Oversight Committee releases, and FOIA disclosures. No documents are fabricated, altered, or sourced from unofficial channels.

Are these documents authentic?
Yes, all documents are sourced directly from official U.S. government agency releases. We maintain chain-of-custody records for each document showing the source URL, download date, and any processing steps applied (OCR, text extraction). Original PDF files are preserved and linked from each document page.

Are any documents redacted?
Some documents contain redactions applied by government agencies before release. These redactions are preserved as-is. We do not add or remove any redactions. Redacted sections are noted where they affect the readability or completeness of a document.

How quickly are new releases processed?
New government releases are typically processed and added to the archive within 48-72 hours. The process involves downloading, OCR processing, text extraction, entity recognition, and indexing. Large batches may take longer.
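
The chain-of-custody record mentioned in the authenticity answer above (source URL, download date, and processing steps applied) could be modeled roughly as follows. This is a sketch under stated assumptions: the class name, field names, and example URL are hypothetical and do not reflect the archive's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class CustodyRecord:
    """Hypothetical per-document provenance record (illustrative only)."""
    source_url: str          # where the original file was downloaded from
    download_date: date      # when it was downloaded
    processing_steps: List[str] = field(default_factory=list)

    def add_step(self, step: str) -> None:
        # Append steps in the order they were applied, e.g. "OCR".
        self.processing_steps.append(step)


# Example usage with a placeholder URL (not a real source).
record = CustodyRecord("https://example.invalid/release/doc.pdf", date(2025, 1, 1))
record.add_step("OCR")
record.add_step("text extraction")
```

Keeping the steps as an ordered list means the record shows which operations were applied to a document, and in what sequence, without touching the preserved original file.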