Last Updated: March 1, 2026

Data Protection Impact Assessment (DPIA)


1. Introduction

This Data Protection Impact Assessment (DPIA) is conducted in accordance with GDPR Article 35 to evaluate the data protection risks arising from SoDNAscan's processing of genetic, health, and biometric data using artificial intelligence.

This assessment is maintained by the data controller. It is available to supervisory authorities upon request and, in summary form, to data subjects who request it at info@sodnascan.com.

This document should be read together with our:


2. Data Controller

No Data Protection Officer (DPO) has been formally appointed. Under GDPR Article 37(1), a DPO is required for public authorities and for organizations whose core activities consist of regular and systematic monitoring of data subjects on a large scale, or of large-scale processing of special category data. SoDNAscan will appoint a DPO if its processing reaches a scale that triggers this obligation.


3. DPIA Triggers

Three criteria drawn from GDPR Article 35 and the Guidelines on Data Protection Impact Assessment (WP 248 rev.01), adopted by the Article 29 Working Party and endorsed by the European Data Protection Board (EDPB), trigger the requirement for this DPIA. Under the guidelines, processing that meets two criteria will in most cases require a DPIA; SoDNAscan meets all three:

# | Trigger | Applicable GDPR Provision | How SoDNAscan Meets It
1 | Processing special category data at scale | Art. 35(3)(b), Art. 9 | Genetic data (DNA genotypes) and health data (blood biomarkers, wearable metrics, health history) are special category data under Art. 9. Processing occurs systematically for all users who upload data.
2 | Automated decision-making or profiling with significant effect | Art. 35(3)(a), Art. 22 | AI-powered analysis generates personalized health assessments, risk profiles, supplement protocols, and monitoring plans based on genetic and health data. While outputs are informational (not clinical decisions), they may influence users' health behaviors.
3 | Use of new technologies | WP 248 rev.01, Criterion 8 | The use of large language models (LLMs) to process genetic data for individualized health analysis is a novel application of AI technology to special category data.

4. Description of Processing

4.1 Purpose

SoDNAscan generates personalized health books by analyzing users' genetic data, blood work results, and wearable health metrics using AI. The service provides wellness insights and educational health information — not medical diagnoses or clinical recommendations.

4.2 Data Subjects

Adult individuals (18+) who voluntarily create an account, upload their genetic data, and consent to AI-powered analysis.

4.3 Categories of Personal Data Processed

Data Category | GDPR Classification | Source
Account data (email, name, password hash) | Ordinary personal data | User-provided at registration
Demographic data (age, sex, height, weight, ethnicity) | Ordinary personal data | User-provided in profile
Genetic data (SNP genotypes: rsid, chromosome, position, alleles) | Special category data, Art. 9 (genetic data) | Uploaded DNA file
Blood work results (biomarker names, values, units, reference ranges, status flags) | Special category data, Art. 9 (health data) | Uploaded PDF or pasted text
Wearable health metrics (heart rate, HRV, SpO2, sleep, activity, body composition) | Special category data, Art. 9 (health data) | Uploaded Apple Health, Oura, Fitbit, or Whoop export
Self-reported health information (health history, family history, goals, supplements, lifestyle) | Special category data, Art. 9 (health data) | User-provided free-text fields
AI-generated health analysis (reports, chapters, fact sheets) | Derived special category data | Generated by AI processing
Payment data (Stripe session ID, payment intent, amount, currency) | Ordinary personal data | Stripe checkout
Consent records (consent type, granted/withdrawn, timestamp, policy version, IP address) | Ordinary personal data | System-generated

4.4 Processing Activities

The processing pipeline consists of four sequential stages:

Stage 1 — Data Ingestion

Stage 2 — Supplementary Data (Optional)

Stage 3 — AI Analysis and Book Generation

Stage 4 — Delivery and Storage

4.5 Recipients and Processors

Recipient | Role | Data Received | Location | DPA/SCCs
Anthropic, PBC | Data processor | Matched genetic variants, confirmed blood biomarkers, aggregated wearable metrics, sanitized profile fields | United States (AWS/GCP) | DPA with SCCs in commercial API terms
Supabase, Inc. | Data processor | All stored data (database, files, auth) | EU West, Frankfurt | DPA with SCCs (signed via PandaDoc)
Stripe, Inc. | Data processor | Email, user ID, payment amount, currency | Global (including US) | DPA with SCCs; PCI DSS Level 1
Resend | Data processor | Email address, user name | United States | DPA

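Sub-processor chain: SoDNAscan (controller) → Anthropic/Supabase/Stripe (processors) → AWS/GCP (infrastructure sub-processors). As controller, SoDNAscan remains responsible for the compliance of the processing it entrusts to these parties (GDPR Articles 5(2) and 24). Under Article 28(4), each processor in turn remains fully liable to SoDNAscan for the performance of its sub-processors' obligations.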


5. Legal Basis

5.1 Legal Basis for Processing

Data Category | Legal Basis
Genetic data, blood work, wearable data, self-reported health information | Art. 9(2)(a): Explicit consent of the data subject
Account data, demographic data | Art. 6(1)(b): Performance of contract
Payment data | Art. 6(1)(b): Performance of contract
Server logs | Art. 6(1)(f): Legitimate interest (security)
Consent records | Art. 6(1)(c): Legal obligation

5.2 Consent Mechanism

Consent for processing special category data is obtained through:

  1. Signup flow: Separate, explicit checkbox for health data processing consent with a direct link to the Data Use Policy. This is distinct from the Terms of Service acceptance and disclaimer acknowledgment.
  2. Consent records: Each consent event (granted or withdrawn) is recorded with timestamp, policy version, and IP address in an immutable audit trail.
  3. Withdrawal: Users can withdraw AI processing consent at any time via account settings. Withdrawal immediately blocks new AI processing but does not affect previously generated content.
  4. Consent gating: Backend endpoints that trigger AI processing enforce active consent — requests are rejected if consent is not currently granted.
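
The audit trail and consent gating described above can be sketched as follows. This is a minimal illustration under stated assumptions, not SoDNAscan's actual backend: the names ConsentEvent, ConsentLog, ConsentRequired, start_ai_processing, and the "ai_processing" consent type label are all hypothetical, and the in-memory list stands in for whatever append-only storage is really used.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a recorded event cannot be mutated afterwards
class ConsentEvent:
    user_id: str
    consent_type: str      # hypothetical label, e.g. "ai_processing"
    granted: bool          # True = granted, False = withdrawn
    timestamp: datetime
    policy_version: str
    ip_address: str


class ConsentLog:
    """Append-only log: events are added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: list[ConsentEvent] = []

    def record(self, event: ConsentEvent) -> None:
        self._events.append(event)

    def has_active_consent(self, user_id: str, consent_type: str) -> bool:
        # The most recent event for this user and consent type decides.
        relevant = [e for e in self._events
                    if e.user_id == user_id and e.consent_type == consent_type]
        if not relevant:
            return False
        return max(relevant, key=lambda e: e.timestamp).granted


class ConsentRequired(Exception):
    """Raised when AI processing is requested without active consent."""


def start_ai_processing(log: ConsentLog, user_id: str) -> str:
    # Gate: reject the request unless consent is currently granted.
    if not log.has_active_consent(user_id, "ai_processing"):
        raise ConsentRequired(f"no active consent for user {user_id}")
    return "queued"
```

Because a withdrawal is stored as a new event rather than an update, the history stays intact while the latest event alone controls whether new processing may start.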

5.3 Necessity and Proportionality


6. Risk Assessment

6.1 Identified Risks

# | Risk | Likelihood | Severity | Overall Risk
R1 | Unauthorized access to genetic data through application breach | Low | Very High | High
R2 | Unauthorized access to genetic data through processor breach (Anthropic, Supabase) | Low | Very High | High
R3 | AI generates inaccurate health information that users act upon | Medium | High | High
R4 | Cross-border transfer exposes genetic data to US government access requests | Low | High | Medium
R5 | Re-identification of anonymized genetic data | Very Low | Very High | Medium
R6 | Genetic data reveals information about non-consenting family members | Medium | Medium | Medium
R7 | AI prompt injection via user-provided text fields | Low | Medium | Low
R8 | Consent is not sufficiently informed or specific for Art. 9 data | Low | High | Medium
R9 | Data retained longer than necessary | Low | Medium | Low
R10 | Sub-processor processes data beyond authorized scope | Very Low | High | Low
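
This assessment does not state the rule used to combine likelihood and severity into an overall rating. As an illustration only, the sketch below shows one simple additive rule that happens to reproduce every row of the table above. The numeric scale and the thresholds are assumptions chosen to fit the table, not a documented SoDNAscan method.

```python
# Assumed ordinal scale; not taken from the assessment itself.
LEVELS = {"Very Low": 1, "Low": 2, "Medium": 3, "High": 4, "Very High": 5}


def overall_risk(likelihood: str, severity: str) -> str:
    """Combine two ordinal ratings by adding their scores (assumed rule)."""
    score = LEVELS[likelihood] + LEVELS[severity]
    if score >= 7:
        return "High"
    if score == 6:
        return "Medium"
    return "Low"


# The register from the table above: (likelihood, severity, stated overall risk).
REGISTER = {
    "R1": ("Low", "Very High", "High"),
    "R2": ("Low", "Very High", "High"),
    "R3": ("Medium", "High", "High"),
    "R4": ("Low", "High", "Medium"),
    "R5": ("Very Low", "Very High", "Medium"),
    "R6": ("Medium", "Medium", "Medium"),
    "R7": ("Low", "Medium", "Low"),
    "R8": ("Low", "High", "Medium"),
    "R9": ("Low", "Medium", "Low"),
    "R10": ("Very Low", "High", "Low"),
}

# Rows where the assumed rule disagrees with the stated rating (none expected).
mismatches = [rid for rid, (lik, sev, stated) in REGISTER.items()
              if overall_risk(lik, sev) != stated]
```

Encoding the register this way makes it easy to re-check the ratings whenever a likelihood or severity value is revised.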

6.2 Severity Criteria


7. Risk Mitigation Measures

7.1 R1 — Application Breach

Measure | Implementation
Authentication security | JWT with ES256 algorithm, verified against Supabase JWKS. Refresh tokens in httpOnly cookies (invisible to JavaScript). Access tokens in memory only (not localStorage).
CSRF protection | Custom header required on auth endpoints
Row-Level Security | RLS enabled on all database tables. All application queries filter by authenticated user ID.
Security headers | HSTS, X-Frame-Options DENY, X-Content-Type-Options nosniff, strict Referrer-Policy, Content Security Policy
Rate limiting | Global rate limits plus per-endpoint overrides for auth and upload endpoints
File validation | Magic byte verification, extension whitelisting, size limits, format-specific content validation
Input sanitization | Free-text fields stripped of special characters before AI prompt interpolation
No third-party tracking | Self-hosted cookie-free analytics only (no personal data collected). No third-party analytics scripts, no tracking cookies, no advertising SDKs.
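
The file validation measure (extension allowlist, magic bytes, size limit) could look roughly like the sketch below. The accepted extensions, the 50 MB limit, and the function name validate_upload are assumptions made for illustration; the assessment does not list the actual formats or limits. The PDF and ZIP signatures are the standard leading bytes for those formats.

```python
from pathlib import PurePath

# Hypothetical allowlist: extension -> acceptable leading bytes.
# An empty tuple marks plain-text formats, which have no signature.
ALLOWED = {
    ".pdf": (b"%PDF-",),
    ".zip": (b"PK\x03\x04",),
    ".txt": (),
    ".csv": (),
}
MAX_BYTES = 50 * 1024 * 1024  # assumed size limit


def validate_upload(filename: str, data: bytes) -> None:
    """Raise ValueError if the upload fails any check; return None otherwise."""
    ext = PurePath(filename).suffix.lower()
    if ext not in ALLOWED:
        raise ValueError(f"extension {ext!r} is not allowed")
    if not data or len(data) > MAX_BYTES:
        raise ValueError("file is empty or exceeds the size limit")
    signatures = ALLOWED[ext]
    if signatures:
        # Magic byte check: the content must match the claimed extension.
        if not data.startswith(signatures):
            raise ValueError("file content does not match its extension")
    elif b"\x00" in data[:4096]:
        # Simple heuristic: NUL bytes suggest binary data, not text.
        raise ValueError("text file appears to contain binary data")
```

Checking the leading bytes in addition to the extension means a renamed executable is rejected even when its name looks acceptable.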

7.2 R2 — Processor Breach

Measure | Implementation
Anthropic retention limit | 7-day retention window, then automatic deletion. No model training on API data.
Anthropic DPA | Data Processing Addendum with SCCs incorporated in commercial terms
Supabase encryption | Database and storage encrypted at rest (AES-256). EU West Frankfurt deployment. DPA with SCCs.
Stripe isolation | Payment processor receives email and user ID only; no genetic, health, or biometric data. PCI DSS Level 1.
Encryption in transit | All data transmission uses TLS 1.2 or higher

7.3 R3 — Inaccurate AI Output

Measure | Implementation
Disclaimer framework | Medical & Wellness Disclaimer required at signup. "Not medical advice" disclosures in Health Book content and Data Use Policy.
Validation pipeline | Rule-based validator checks SNP coverage, allele consistency, confidence distribution, and supplement inference chain lengths. Semantic validator checks cross-report consistency.
Evidence-tier system | SNP reference database includes evidence tiers and confidence scores. AI system prompt requires citing evidence quality for each finding.
Blood work user verification | Extracted biomarkers must be explicitly confirmed by the user before entering AI analysis
Human oversight disclosure | Data Use Policy Section 10 clearly states that AI outputs are not reviewed by medical professionals
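
The blood work verification step amounts to a filter between extraction and analysis. A minimal sketch follows, assuming a hypothetical Biomarker record with a confirmed flag that only an explicit user action sets; the field names are illustrative and not taken from SoDNAscan's schema.

```python
from dataclasses import dataclass


@dataclass
class Biomarker:
    name: str
    value: float
    unit: str
    confirmed: bool = False  # assumed flag, set only by an explicit user action


def biomarkers_for_analysis(extracted: list[Biomarker]) -> list[Biomarker]:
    """Return only the biomarkers the user has confirmed."""
    return [b for b in extracted if b.confirmed]
```

Defaulting the flag to False means a value extracted from a PDF never reaches the AI stage unless the user has reviewed it.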

7.4 R4 — Cross-Border Transfer Risk

Measure | Implementation
Standard Contractual Clauses | SCCs in place with Anthropic, Supabase, and Stripe
Transfer Impact Assessment | Separate TIA conducted. Available upon request.
Encryption | TLS 1.2+ in transit. AES-256 at rest.
Retention limitation | 7-day retention at Anthropic limits the exposure window
EU data residency | Supabase deployed in EU West Frankfurt; stored data does not leave the EU

7.5 R5 — Re-identification Risk

Measure | Implementation
No data sharing | Genetic data is never sold, shared, or combined across users
No public datasets | Generated health books are private to the user
Access isolation | RLS and per-user query filtering prevent any cross-user data access

7.6 R6 — Family Member Privacy

Measure | Implementation
User notice | Privacy Policy Section 12 explicitly informs users that genetic data reveals information about biological relatives
Data subject scope | Only the uploading individual's data is processed; no family member data is collected or inferred
No familial matching | The service does not perform relative matching, ancestry tracing, or cross-user genetic comparison

7.7 R7 — Prompt Injection

Measure | Implementation
Input sanitization | All free-text profile fields are stripped of special characters before inclusion in AI prompts
Field length limits | Free-text fields truncated to 1,000–2,000 characters
XML containment | Blood work report text is wrapped in containment tags with system instructions to treat contents as raw data
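
The three prompt injection measures above can be sketched together. This is an illustration under assumptions: the character allowlist, the 1,000-character default, the tag name blood_work_report, and the function names are hypothetical, since the assessment does not specify them.

```python
import re

MAX_FIELD_CHARS = 1000  # the assessment states 1,000 to 2,000 depending on field

# Assumed allowlist: letters, digits, whitespace, and basic punctuation.
_DISALLOWED = re.compile(r"[^\w\s.,;:()%/+'-]")


def sanitize_field(text: str, limit: int = MAX_FIELD_CHARS) -> str:
    """Strip special characters, collapse whitespace, and truncate."""
    cleaned = _DISALLOWED.sub("", text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned[:limit]


def wrap_report(report_text: str, tag: str = "blood_work_report") -> str:
    """Wrap untrusted report text in a containment tag (hypothetical name)."""
    # Remove anything resembling the containment tag itself, so the
    # text cannot close the container early and escape it.
    pattern = rf"</?\s*{re.escape(tag)}[^>]*>"
    safe = re.sub(pattern, "", report_text, flags=re.IGNORECASE)
    return f"<{tag}>\n{safe}\n</{tag}>"
```

Containment tags only help if the system instructions tell the model to treat everything inside them as raw data, as the table notes; the wrapping by itself is not a complete defence.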

7.8 R8 — Consent Quality

Measure | Implementation
Separate consent | Health data processing consent is a standalone checkbox, distinct from Terms of Service and disclaimer
Linked policy | Consent checkbox links directly to the Data Use Policy
Consent audit trail | Immutable records: consent type, granted/withdrawn, timestamp, policy version, IP address
Withdrawal mechanism | Settings page toggle for immediate consent withdrawal; backend enforces at the endpoint level
Re-consent on changes | Privacy Policy commits to requesting renewed consent if processing changes materially affect genetic or health data

7.9 R9 — Data Retention

Measure | Implementation
Wearable data minimisation | Raw wearable files are deleted immediately after parsing. Only aggregated metrics are retained.
Account deletion cascade | Full cascading deletion of all user data across all tables and storage buckets
Anthropic auto-deletion | 7-day retention window with automatic deletion
Defined retention schedule | Retention periods documented in Privacy Policy Section 8 for all data categories
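
The wearable data minimisation measure (parse, keep aggregates, delete the raw file) might be sketched as below. The CSV layout with a heart_rate column and the function name aggregate_and_discard are assumptions for illustration; real Apple Health, Oura, Fitbit, or Whoop exports use their own formats.

```python
import csv
import os
import statistics


def aggregate_and_discard(raw_path: str) -> dict:
    """Parse a raw export, return aggregated metrics, and delete the raw file.

    Assumes a CSV file with a 'heart_rate' column (hypothetical layout).
    """
    try:
        with open(raw_path, newline="") as fh:
            values = [float(row["heart_rate"]) for row in csv.DictReader(fh)]
    finally:
        # Delete the raw file whether or not parsing succeeded,
        # so no unparsed raw data is left behind.
        os.remove(raw_path)
    if not values:
        raise ValueError("export contained no heart rate samples")
    return {
        "hr_samples": len(values),
        "hr_mean": statistics.fmean(values),
        "hr_min": min(values),
        "hr_max": max(values),
    }
```

Placing the deletion in a finally clause keeps the retention guarantee independent of parsing errors.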

7.10 R10 — Sub-Processor Scope Creep

Measure | Implementation
DPA terms | Each processor is bound by a DPA specifying permitted processing purposes
No-training guarantee | Anthropic's commercial terms prohibit model training on API data
Periodic review | Processor terms and sub-processor lists to be reviewed annually
Transparency | Sub-processor chain is disclosed in Privacy Policy Section 6

8. Residual Risks

After implementing the mitigation measures above, the following residual risks remain:

Risk | Residual Level | Justification
R1 (Application breach) | Low | Standard security controls in place; no system is immune to zero-day vulnerabilities
R2 (Processor breach) | Low | Mitigated by DPAs, encryption, and retention limits; residual risk inherent in any cloud processing
R3 (Inaccurate AI output) | Medium | AI model limitations are inherent. Mitigated by disclaimers, validation, and user responsibility disclosure. Cannot be fully eliminated.
R4 (Cross-border transfer) | Low | SCCs and encryption provide adequate safeguards under current CJEU jurisprudence (Schrems II)
R6 (Family implications) | Medium | Inherent to genetic data. Adequately disclosed to users. Cannot be technically eliminated.

No residual risk is assessed as high after mitigation. Processing may proceed.


9. Consultation

9.1 Supervisory Authority

GDPR Article 36 requires prior consultation with the supervisory authority if the DPIA indicates that processing would result in a high risk that the controller cannot mitigate. Based on this assessment, residual risks have been mitigated to acceptable levels and prior consultation is not required.

If the risk profile changes materially (e.g., new processing activities, changes to Anthropic's retention terms, or expansion of data categories), this assessment will be re-evaluated and prior consultation will be sought if necessary.

9.2 Data Subjects

Users are informed of the processing through:


10. Review Schedule

This DPIA will be reviewed and updated:


11. Conclusion

This DPIA confirms that SoDNAscan's processing of genetic and health data using AI:

Processing may proceed subject to ongoing compliance monitoring and the review schedule above.


12. Contact

For questions about this DPIA or to request the full document: