
# Introduction


**OtisEd DataPump** is a manifest-driven data extraction tool that automates the process of pulling data from your databases and delivering it to secure cloud storage. You define what data to extract using manifests: instruction documents containing SQL queries, output formatting options, and delivery targets. DataPump handles the rest: connecting to your database, executing each query, writing the results to delimited files (TSV, CSV, or pipe-delimited), and uploading everything to one or more cloud destinations including AWS S3, Azure Blob Storage, Google Cloud Storage, or SFTP servers. Alongside the extracted data, DataPump generates DDL and metadata format files that describe the structure and data types of the source data, enabling target systems to automatically create matching tables for ingestion.
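The extraction flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DataPump's implementation: the `manifest` dictionary layout, the `run_manifest` helper, and the `students` table are all hypothetical, and an in-memory SQLite database stands in for a real source system. The actual manifest format is defined elsewhere in this documentation.

```python
import csv
import io
import sqlite3

# Hypothetical manifest layout, for illustration only (not DataPump's real schema).
manifest = {
    "queries": [
        {"name": "students", "sql": "SELECT id, name FROM students ORDER BY id"},
    ],
    "delimiter": "\t",  # TSV here; "," would give CSV and "|" pipe-delimited
}

def run_manifest(conn, manifest):
    """Execute each query and return a mapping of query name to delimited text."""
    outputs = {}
    for query in manifest["queries"]:
        cursor = conn.execute(query["sql"])
        header = [column[0] for column in cursor.description]
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=manifest["delimiter"], lineterminator="\n")
        writer.writerow(header)   # column names first
        writer.writerows(cursor)  # then one line per result row
        outputs[query["name"]] = buffer.getvalue()
    return outputs

# In-memory SQLite stands in for the source database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

files = run_manifest(conn, manifest)
```

In this sketch `files["students"]` holds the tab-delimited text for the query; uploading it to a cloud destination and generating the DDL and metadata files are left out.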

DataPump supports SQL Server, Oracle, Snowflake, and ODBC data sources, and can process multiple manifests in parallel. Each extraction run generates a submission ticket that records row counts, file checksums, timing information, and any errors or warnings, giving you a complete audit trail of what was extracted and delivered. Optional system columns such as row hashes, sequential row numbers, and batch identifiers can be appended to your data for deduplication, change detection, and traceability in downstream systems.
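To make the system columns and ticket fields more concrete, here is a minimal Python sketch of how row hashes, row numbers, a batch identifier, and a file checksum could be computed. It rests on several assumptions: SHA-256 as the hash, a UUID as the batch identifier, and the `add_system_columns`, `file_checksum`, and `ticket` names are all hypothetical. The source does not state which algorithms or field names DataPump actually uses.

```python
import hashlib
import uuid

def add_system_columns(rows, batch_id):
    """Append a row hash, a 1-based row number, and a batch id to each row."""
    enriched = []
    for number, row in enumerate(rows, start=1):
        # Join values with the ASCII unit separator so ("ab", "c") and ("a", "bc") differ.
        joined = "\x1f".join("" if value is None else str(value) for value in row)
        row_hash = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        enriched.append((*row, row_hash, number, batch_id))
    return enriched

def file_checksum(data: bytes) -> str:
    """Hex SHA-256 digest of a file's bytes (algorithm assumed for this sketch)."""
    return hashlib.sha256(data).hexdigest()

rows = [(1, "Ada"), (2, "Grace"), (1, "Ada")]
batch_id = str(uuid.uuid4())  # assumed identifier format
enriched = add_system_columns(rows, batch_id)

# Hypothetical ticket fields; a real submission ticket also records timing and errors.
ticket = {
    "row_count": len(enriched),
    "checksum": file_checksum(b"id\tname\n1\tAda\n2\tGrace\n1\tAda\n"),
}
```

Identical source rows produce identical hashes, which is what lets a downstream system detect duplicates or unchanged records, while the row number and batch identifier trace each row back to a specific run.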

DataPump is a component in the OtisEd data pipeline ecosystem. **Zipline** provides centralized monitoring, configuration, and execution management for batch ETL jobs. The **DataReceiver** consumes DataPump output and ingests it into target systems including the iMart Data Lake, iMart Data Warehouse, SLDS systems, Ed-Fi Data Warehouse, and CEDS Data Warehouse. Together, these tools provide an end-to-end pipeline from source database extraction through cloud delivery to data warehouse ingestion.
