Protocol → Deliverables · Single Source of Truth

From protocol to study‑ready deliverables in days, not months.

DocumentSpark extracts the structured facts inside your protocol once, then generates every downstream data management, regulatory, and SDLC validation artifact your trial needs — traceable to the clause it derives from, and re-reconciled when the protocol changes.

28+ · Deliverable types
6 · Workflow phases
1 · Source of truth
Aligned with: CDISC CDASHIG v2.2 · CDISC SDTMIG v3.3 · ICH E6 GCP · SCDM GCDMP · GAMP 5 · FDA SDTCG · DIA TMF RM
What happens today

A team of people, retyping the same protocol by hand.

When a protocol is approved, a dozen specialists — across the sponsor, the CRO, data management, and regulatory affairs — each open the same document and start writing their own set of deliverables from scratch. They are all drawing from the same source, but they work independently and in their own formats. The documents disagree with each other before the trial even begins, and every protocol amendment forces the whole team to start reconciling all over again.

A. Manual labor

The same document, authored a dozen times over.

Every team that touches the trial reads the protocol and writes their own set of documents from scratch. The same fact ends up retyped a dozen ways, in a dozen different formats — without any of the authors reading each other's work.

B. Inconsistency

The documents disagree from day one.

Because every author works independently, almost no two documents match. Catching the mismatches is a manual job — someone has to compare the documents line by line. The errors that slip through are discovered weeks or months later, often by an auditor.

C. Human error

One missed change cascades for the life of the trial.

When the protocol is amended, someone has to find every document the change touches, open it, edit it by hand, and re-route it for review. Miss a document and the inconsistency follows the trial into data collection — where it becomes expensive to unwind.

D. Time

Four to nine months before the first patient.

Writing, reviewing, and reconciling the document set sits squarely on the critical path between an approved protocol and enrolling the first patient. Every week added there is a week the trial isn't running.

4–9 mo · Study setup time
Between an approved protocol and the first patient enrolled in a mid-sized trial.

28+ · Documents per study
Stand-alone documents a study team has to write and keep in sync, entirely by hand.

37% · Cost of rework
Share of trial cost that industry estimates attribute to mid-study changes and the manual reconciliation they trigger.

How it works

One protocol in. A complete study package out.

Five stages move a protocol from PDF to a deployment-ready package of CRFs, plans, specifications, and regulatory documentation — with every artifact linked to the protocol clause it derives from and re-checked whenever the protocol changes.

Stage 01

Ingest

Upload the protocol PDF. Manual CRF packets (DOCX) and REDCap data dictionaries (CSV) can be imported alongside.

Inputs · PDF · DOCX · CSV
Stage 02

Extract

A two-pass structured read turns the protocol into a typed fact base — the study arms, visit schedule, eligible population, treatments, and assessments. Every fact is reviewable against the clause it came from.

Verified · 2-pass extraction
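
As a rough illustration of what a typed fact base with clause-level provenance could look like, here is a minimal sketch. The class names (`ClauseRef`, `Fact`, `FactBase`), the field layout, and the sample values are all assumptions made for this example; they are not DocumentSpark's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClauseRef:
    """Pointer to the protocol text a fact was read from (hypothetical shape)."""
    section: str
    page: int


@dataclass(frozen=True)
class Fact:
    """One typed fact plus the clause it came from."""
    kind: str       # e.g. "arm", "visit", "assessment"
    name: str
    value: object
    source: ClauseRef


@dataclass
class FactBase:
    facts: list[Fact] = field(default_factory=list)

    def add(self, fact: Fact) -> None:
        self.facts.append(fact)

    def of_kind(self, kind: str) -> list[Fact]:
        """All facts of one kind, in insertion order."""
        return [f for f in self.facts if f.kind == kind]


# Illustrative content only, not taken from any real protocol.
base = FactBase()
base.add(Fact("arm", "Treatment A", {"ratio": 1}, ClauseRef("3.1", 12)))
base.add(Fact("arm", "Placebo", {"ratio": 1}, ClauseRef("3.1", 12)))
base.add(Fact("visit", "Screening", {"day": -14}, ClauseRef("6.2", 27)))

arm_names = [f.name for f in base.of_kind("arm")]
```

Because every `Fact` carries its own `ClauseRef`, a reviewer or a downstream generator can always get from a value back to the section and page it was read from.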
Stage 03

Generate

Every downstream artifact is generated from the same fact base — CRFs, DMP, edit checks, SDTM mapping, consent forms, SDLC validation suite, TMF skeleton.

28+ artifacts · One source
Stage 04

Validate

Traceability runs in both directions between the protocol and its artifacts. A quality audit applies deterministic and structural checks, and auto-repair resolves the issues that don't require human judgment.

Bidirectional · Auto-repair
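
One way to picture a two-way traceability check is as a pair of set comparisons. The sketch below is an assumption-laden illustration: the function name `trace_gaps`, the clause ids, and the input shapes are hypothetical, and the real audit presumably covers far more than this.

```python
def trace_gaps(protocol_clauses, artifact_refs):
    """Return (uncovered_clauses, orphan_refs).

    protocol_clauses: set of clause ids present in the protocol.
    artifact_refs: dict mapping artifact id -> set of clause ids it cites.
    """
    cited = set().union(*artifact_refs.values()) if artifact_refs else set()

    # Forward direction (protocol -> artifact): clauses that no artifact cites.
    uncovered = sorted(protocol_clauses - cited)

    # Backward direction (artifact -> protocol): citations to clauses
    # that are not in the current protocol.
    orphans = sorted(
        (artifact, clause)
        for artifact, clauses in artifact_refs.items()
        for clause in clauses
        if clause not in protocol_clauses
    )
    return uncovered, orphans


# Illustrative ids only.
clauses = {"3.1", "5.1", "6.2"}
refs = {"D-03": {"6.2"}, "D-07": {"3.1", "9.9"}}
uncovered, orphans = trace_gaps(clauses, refs)
```

Here clause 5.1 is reported as uncovered and the D-07 citation of 9.9 as an orphan, which is the kind of finding an amendment would be expected to surface.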
Stage 05

Export

Deliver each artifact in the format its consumer expects — DOCX for sponsors, CDISC ODM XML for EDCs, Define-XML and a full ZIP for submission, CSV/XLSX for downstream tooling.

DOCX · ODM · Define · ZIP
Single source of truth

Every artifact is bound to the protocol clause that produced it.

The Protocol Intelligence workspace shows the structured model extracted from your protocol — arms, population, schedule, interventions — reviewable side-by-side with the source clause. Downstream generators consume the same fact base, so every deliverable inherits the same canonical numbers.

Full coverage

28+ artifacts. One fact base.

DocumentSpark generates the full set of artifacts required to operationalize a trial — from case report forms to the SDLC validation package to the submission-ready CDISC bundle. Each artifact is version-controlled, reviewable, and bound to the protocol clause it derives from.

Tier I

Study setup

2 artifacts
D-01
Study Design Configuration
SDC · arms · visits · randomization
D-02
Informed Consent Form
ICF · 6 sections, per-section regen
Tier II

Case Report Forms

4 artifacts
D-03
Case Report Forms
CRF · canonical form codes
D-04
Annotated CRF
aCRF · SDTM annotations
D-05
CRF Instruction Guide
per-form narrative guidance
D-06
Visit–Form Matrix
visit × form assignments
Tier III

Data Management (SCDM GCDMP)

8 artifacts
D-07
Data Management Plan
DMP · 18 GCDMP sections
D-08
Data Dictionary
DD · variable catalog, CT
D-09
Edit Check Specification
ECS · range · logical · derived
D-10
Data Review & Cleaning Plan
DRCP
D-11
SAE Reconciliation Procedure
SAERP
D-12
Database Lock Checklist
DBLC
D-13
External Data Transfer Spec
EDTS · per-vendor profiles
D-14
Medical Coding Specification
MCS · MedDRA · WHODrug · LOINC
Tier IV

SDTM Submission (CDISC · FDA SDTCG)

5 artifacts
D-15
SDTM Mapping Package
CDASH → SDTM · lookup-first
D-16
Define.xml
v2.1 (or v2.0 pre-2023)
D-17
SDTM Reviewer's Guide
SDRG
D-18
SUPPQUAL Templates
SUPP-- supplemental qualifiers
D-19
Submission ZIP Package
full bundle · validator-ready
Tier V

SDLC Validation (GAMP 5)

11 artifacts
D-20
User Requirements Spec
URS
D-21
Functional Requirements Spec
FRS
D-22
System Design Spec
SDS
D-23
Application Design Spec
ADS · target EDC
D-24
Audit Trail Design Spec
ATDS
D-25
Security & Access Controls
SACS
D-26
Internal Interface Spec
IIS
D-27
DB Physical Model
tables + attributes
D-28
Test Plan / List / Procedure
SDLC test suite
D-29
UAT Protocol
UAT · test cases + criteria
D-30
UAT Report
aggregated outcomes
Tier VI

Trial Master File (DIA RM · ICH E6)

2 scaffolds
D-31
TMF Zone Structure
DIA RM · 10 zones
D-32
Essential Documents Checklist
ICH E6 · before · during · after
Interoperable, EDC-agnostic

Deliver to any EDC, CDMS, or eTMF.

DocumentSpark sits upstream of your data-capture stack and exports each artifact in the format its consumer expects. Nothing locks you to a specific EDC vendor — the platform produces standards-bound, validator-ready files.

.xml | CDISC ODM 1.3 | CRFs · Medidata mdsol extensions
.xml | Define-XML 2.1 | v2.0 fallback pre-2023
.docx | Microsoft Word | DMP · URS · FRS · SDS · MCS · UAT · 14 more
.csv · .xlsx | Tabular | Data dictionary · edit checks
.pdf | Annotated CRF | aCRF · regulator-ready
.zip | Submission Bundle | SDTM · Define · SDRG · SUPPQUAL
Standards & compliance

Pinned versions. Date-resolved against the FDA catalog.

Each artifact conforms to the structural, terminological, and process standards that govern modern clinical trials. Submission targets (SDTM, Define-XML, controlled terminology) are date-resolved against the FDA Data Standards Catalog based on your study start date.
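
Date resolution can be thought of as a lookup over support windows. The sketch below assumes a hypothetical in-memory `CATALOG` and a `resolve` helper; the cutoff dates are placeholders derived only from the "pre-2023" wording on this page and are not values from the FDA Data Standards Catalog.

```python
from datetime import date

# Hypothetical rows: (standard, version, supported_from, supported_until).
# Dates are placeholders for illustration, NOT FDA catalog values.
CATALOG = [
    ("Define-XML", "2.0", date.min, date(2022, 12, 31)),
    ("Define-XML", "2.1", date(2023, 1, 1), None),
]


def resolve(standard, study_start):
    """Pick the newest version whose support window contains study_start."""
    matches = [
        (start, version)
        for name, version, start, end in CATALOG
        if name == standard
        and start <= study_start
        and (end is None or study_start <= end)
    ]
    if not matches:
        raise LookupError(f"no supported {standard} version for {study_start}")
    # The latest supported_from date wins when windows overlap.
    return max(matches)[1]


older = resolve("Define-XML", date(2022, 6, 1))
newer = resolve("Define-XML", date(2024, 2, 1))
```

Keeping the windows as data rather than as branching logic means a catalog update is a row change, not a code change.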

CDISC · CDASH
Clinical Data Acquisition Standards Harmonization
Pinned · IG v2.2 · 7-domain baseline
CRF fields and data collection structures conform to the CDASH model out of the box, grounded against the CDASH IG v2.2 baseline.
CDISC · SDTM
Study Data Tabulation Model
SDTM 1.7 · SDTMIG 3.3 · CT 2025-09
SDTM mapping is lookup-first against a 315-entry curated catalog; variables without a lookup entry are explicitly annotated NOT SUBMITTED rather than guessed.
CDISC · Define / ODM
Define-XML 2.1 · ODM 1.3
v2.0 / pre-2023 fallback via FDA catalog
Define.xml and ODM exports follow CDISC structure; ODM uses the Medidata mdsol vendor extensions where applicable.
ICH · E6 GCP
Good Clinical Practice
R2 / R3 (2025)
Generated artifacts preserve the traceability, version control, and audit posture required by ICH E6.
SCDM · GCDMP
Good Clinical Data Management Practices
Per-chapter (2023 / 2025)
DMP, edit checks, DRCP, and SAE reconciliation follow GCDMP guidance — the 18-section DMP carries per-section excerpts as the generation contract.
ISPE · GAMP 5
Good Automated Manufacturing Practice
Framework for SDLC suite
URS, FRS, SDS, ADS, ATDS, SACS, IIS and the full test suite are produced under a GAMP 5 framework with bidirectional traceability.
FDA · SDTCG
Study Data Technical Conformance Guide
May 2023
Submission-bound artifacts align with FDA's technical conformance requirements for study data. Versions are resolved from the FDA Data Standards Catalog by study start date.
FDA · CSA & CSV
Computer Software Assurance
Cited in ADS · ATDS · SDTM
Validation deliverables incorporate FDA guidance on computerized systems in clinical investigations (2007, plus the October 2024 Q&A) and CSA-aligned, risk-based test rationale.
DIA · TMF RM
DIA Reference Model
10-zone structure
The TMF scaffold is seeded against the DIA Reference Model, with the ICH E6 essential-document checklist organized into before / during / after trial categories.
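
The lookup-first SDTM mapping described in the CDISC · SDTM entry above can be sketched as a dictionary lookup with an explicit fallback. The two-entry `SDTM_LOOKUP` table and the `map_variable` helper are hypothetical stand-ins for the curated catalog, shown only to illustrate the idea of annotating a miss rather than guessing a target.

```python
# Hypothetical two-entry stand-in for the curated lookup catalog.
SDTM_LOOKUP = {
    "BRTHDAT": ("DM", "BRTHDTC"),
    "AETERM": ("AE", "AETERM"),
}

NOT_SUBMITTED = "NOT SUBMITTED"


def map_variable(cdash_var):
    """Annotate a CDASH variable with its SDTM target, or mark it NOT SUBMITTED."""
    hit = SDTM_LOOKUP.get(cdash_var)
    if hit is None:
        # No curated entry: say so explicitly instead of inferring a target.
        return {"cdash": cdash_var, "annotation": NOT_SUBMITTED}
    domain, sdtm_var = hit
    return {"cdash": cdash_var, "annotation": f"{domain}.{sdtm_var}"}


mapped = map_variable("BRTHDAT")
unmapped = map_variable("SITE_NOTE")  # made-up variable with no lookup entry
```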
Governance · 01

Bidirectional traceability

Every artifact carries a back-reference to the protocol clauses it depends on; validation runs both ways — protocol → artifact and artifact → protocol.

Governance · 02

Version-controlled artifacts

Artifacts carry integer versions with an immutable per-version history. The lifecycle moves draft → in review → pending approval → approved; when an amendment replaces an approved version, that version is marked superseded.
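
The lifecycle above reads naturally as a small state machine. The following sketch is an assumption: the `ArtifactVersion` class, the `amend` helper, and the choice to model only forward transitions (no rejection or send-back paths) are made up for illustration and are not the product's implementation.

```python
from dataclasses import dataclass

# Forward transitions only; rejection paths are deliberately not modeled here.
NEXT_STATE = {
    "draft": "in review",
    "in review": "pending approval",
    "pending approval": "approved",
}


@dataclass
class ArtifactVersion:
    artifact_id: str
    version: int
    state: str = "draft"

    def advance(self):
        """Move to the next lifecycle state, or fail if there is none."""
        if self.state not in NEXT_STATE:
            raise ValueError(f"cannot advance from {self.state!r}")
        self.state = NEXT_STATE[self.state]


def amend(current):
    """Supersede an approved version and open the next integer version as a draft."""
    if current.state != "approved":
        raise ValueError("only an approved version can be superseded")
    current.state = "superseded"
    return ArtifactVersion(current.artifact_id, current.version + 1)


v1 = ArtifactVersion("D-07", 1)
for _ in range(3):
    v1.advance()          # draft -> in review -> pending approval -> approved
v2 = amend(v1)
```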

Governance · 03

Recorded approval

Approval requires password re-confirmation; each approval persists a signature timestamp and signature meaning against the artifact version being signed.

Governance · 04

Immutable change history

Every state change writes to an immutable change log — user, action, resource, before/after — exportable for sponsor governance and oversight.
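
A change log of this shape can be approximated with frozen records and an append-only container. This is a simplified sketch under stated assumptions: `ChangeEntry` and `ChangeLog` are hypothetical names, the sample entry is invented, and real immutability would be enforced at the storage layer rather than in application objects.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChangeEntry:
    """One state change: who did what to which resource, and the before/after values."""
    user: str
    action: str
    resource: str
    before: str
    after: str


class ChangeLog:
    """Append-only: entries can be recorded and read, never edited or removed."""

    def __init__(self):
        self._entries = []

    def record(self, entry):
        self._entries.append(entry)

    @property
    def entries(self):
        # Hand out a tuple so callers cannot mutate the underlying list.
        return tuple(self._entries)

    def export_json(self):
        return json.dumps([asdict(e) for e in self._entries], indent=2)


log = ChangeLog()
log.record(ChangeEntry("jdoe", "approve", "D-07 v1", "pending approval", "approved"))
exported = json.loads(log.export_json())
```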

Built for the full trial team

One platform, every role in the study build.

From sponsor to CRO to regulatory affairs, DocumentSpark replaces fragmented authoring tools with a single environment that produces every deliverable each function depends on — without locking the trial to a single EDC or vendor.

01

Sponsors

Compress study start-up timelines, reduce vendor dependencies, and maintain a single source of truth across portfolios. Hand a complete, validated package to any CRO or EDC.

Sponsor leadership · Program management
02

CROs

Standardize study build across clients and EDCs. Move faster on competitive bids by reducing the cost of upfront authoring work; compress UAT by validating against the source protocol.

Operations · Study delivery
03

Clinical Data Managers

Author CRFs, edit checks, and the DMP from a structured protocol model instead of free text. Catch amendment impact before it reaches UAT through traceability and the staleness flag.

Data management · Biostatistics
04

Regulatory, QA & PIs

Inherit traceability and version control by default. Submit study data aligned to the FDA conformance guide without rework; investigators see consent and assessments derived from the protocol they approved.

Regulatory · Quality assurance · PIs
Request a demo

See your protocol generate a study package.

Bring a redacted protocol or a representative document from a recent trial. We will walk you through the structured extraction, the generated deliverables, and the traceability model — live, against your own content.

Typical session
45 minutes · remote
For best fit
Sponsor, CRO, or DM lead
Materials
Redacted protocol (optional)
Follow-up
Sample deliverable package

We respond within one business day. Materials shared with us are treated as confidential and are not retained without explicit authorization.