Security & Data Practices

Last Updated: April 18, 2026

Purpose: This document provides technical details about SentraCheck's security architecture and data handling practices. It is intended to support insurance applications, compliance audits, and due diligence reviews.

1. Data Flow Architecture

SentraCheck processes documents through a stateless, ephemeral pipeline designed to minimize data exposure:

[User Browser] --HTTPS/TLS 1.3--> [Load Balancer]
    |
    v
[API Gateway (PHP/Apache)] --> [Auth Service] (JWT validation)
    |
    v
[Job Queue (MySQL)] --> [Background Worker] (systemd service)
    |
    +--> [PII Detection Engine] (Anthropic Claude API — text only)
    +--> [ADA Accessibility Checker] (self-hosted rules)
    +--> [PDF Conversion Engine] (Anthropic Claude API — text only)
    +--> [OCR Engine] (self-hosted Tesseract)
    |
    v
[Findings stored encrypted] --> [User Dashboard]

* Uploaded document files are stored temporarily and auto-deleted after 24 hours by default (configurable)
* Scan findings stored in encrypted database — accessible only to the submitting user/org
* Original document files are never sent to external APIs; only extracted text is sent

1.1 Document Upload Flow

User uploads document via browser (HTTPS/TLS 1.3)
Document received in memory at API gateway
JWT token validated against auth service
Document streamed to ephemeral processing container
Analysis performed entirely in memory
Results returned to user
Processing container terminated; memory zeroed
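
The JWT validation step in the flow above can be illustrated with a minimal sketch. It assumes HS256-signed tokens carrying an `exp` claim and a shared secret; this document does not state which signing algorithm or claims the auth service uses, and the function names below are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are URL-safe base64 without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a signed token (included only to exercise the validator)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def validate_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except Exception as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    # Pin the algorithm (assumed HS256) instead of trusting the header blindly.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    current = time.time() if now is None else now
    if not isinstance(claims.get("exp"), (int, float)) or claims["exp"] <= current:
        raise ValueError("token expired or missing exp")
    return claims
```

A request would be rejected before any document bytes reach the processing step if `validate_hs256` raises.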

1.2 Data at Rest

SentraCheck stores the minimum data necessary to operate the service. What is stored persistently:

Account data: User profiles, organization info, billing records
Scan findings: Issue metadata (type, severity, page location) stored encrypted — document content is never stored
Document files: Uploaded files retained for 24 hours by default (configurable per organization), then securely deleted
Converted HTML: PDF conversion output is stored and remains accessible via viewer URL until deleted by the user
Audit logs: Authentication events, API access logs (no document content)
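
As an illustration of the "issue metadata only" model for scan findings, the sketch below reduces a detection to the three fields named above (type, severity, page location) and drops everything else. The `Finding` class and the shape of the `detection` dictionary, including its `matched_text` key, are assumptions made for the example, not the actual schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Finding:
    """Persistable metadata about one issue; deliberately has no content field."""
    issue_type: str
    severity: str
    page: int


def to_finding(detection: dict) -> Finding:
    # Copy only the allowlisted metadata. Any matched text present in the
    # (hypothetical) detection payload is never read, so it cannot be stored.
    return Finding(
        issue_type=detection["type"],
        severity=detection["severity"],
        page=detection["page"],
    )


detection = {"type": "ssn", "severity": "high", "page": 3, "matched_text": "123-45-6789"}
record = asdict(to_finding(detection))
```

Because the record type has no field for the matched value, persisting `record` cannot write document content by accident.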

2. Data Retention Schedule

Data Type	Retention Period	Storage Location
Uploaded Document Files	24 hours (default, configurable)	Encrypted file storage (US), auto-deleted
Document Content / Raw Text	Never stored	Used in-memory for analysis only
Scan Findings (issue type, severity, location)	Duration of account	Encrypted database (US) — no document content stored
Converted HTML Output	Until deleted by user	Encrypted file storage (US)
Account Data	Duration of account + 30 days	Encrypted database (US)
Authentication Logs	90 days	Encrypted log storage (US)
API Access Logs	90 days	Encrypted log storage (US)
Aggregate Usage Stats	2 years	Analytics database (US)
Billing Records	7 years	Financial systems (US)
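
The fixed retention periods in the schedule above could be enforced by a periodic cleanup job along the following lines. This is a sketch under stated assumptions: the data-type keys and the `is_expired` helper are hypothetical, years are approximated as 365 days, and entries without a fixed period (account data, scan findings, converted HTML) are left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Fixed periods taken from the retention schedule; key names are illustrative.
RETENTION = {
    "uploaded_document_file": timedelta(hours=24),
    "authentication_log": timedelta(days=90),
    "api_access_log": timedelta(days=90),
    "aggregate_usage_stats": timedelta(days=365 * 2),  # "2 years", approximated
    "billing_record": timedelta(days=365 * 7),         # "7 years", approximated
}


def is_expired(
    data_type: str,
    created_at: datetime,
    now: datetime,
    override: Optional[timedelta] = None,
) -> bool:
    """True once the item has reached its retention period.

    `override` models the per-organization setting for uploaded files.
    """
    period = override if override is not None else RETENTION[data_type]
    return now - created_at >= period


uploaded = datetime(2026, 4, 18, 12, 0, tzinfo=timezone.utc)
due = is_expired("uploaded_document_file", uploaded, uploaded + timedelta(hours=24))
```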

3. Hosting Infrastructure

3.1 Primary Infrastructure

Component	Provider	Location	Certifications
Application Servers	AWS	US-West-1 (N. California)	SOC 2, ISO 27001, FedRAMP
Database	AWS RDS	US-West-1 (N. California)	SOC 2, ISO 27001
CDN / DDoS Protection	Cloudflare	Global Edge	SOC 2, ISO 27001
DNS	Cloudflare	Global	DNSSEC enabled

3.2 Data Residency

All persistent data is stored exclusively in the United States, specifically in the AWS US-West-1 (N. California) region. Data does not leave the United States. Document processing occurs in US-based infrastructure. No customer data is transferred to or processed in other jurisdictions.

4. Encryption Standards

Layer	Method	Key Management
Data in Transit	TLS 1.3 (minimum TLS 1.2)	Auto-rotated certificates (Let's Encrypt / AWS ACM)
Data at Rest (Database)	AES-256	AWS KMS (customer-managed keys available)
Data at Rest (Backups)	AES-256	AWS KMS
Data at Rest (Logs)	AES-256	AWS KMS
Password Storage	bcrypt (cost factor 12)	Per-user salt, adaptive hashing
Session Tokens	Cryptographically random tokens	Generated with a secure random source; stored in the database
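
One common way to implement database-backed session tokens with secure random generation is sketched below. Storing only a SHA-256 digest of the token, rather than the token itself, is an assumption of this example and is not stated in the table; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def issue_session_token() -> Tuple[str, str]:
    """Return (token for the client, digest to store in the database)."""
    token = secrets.token_urlsafe(32)  # 32 bytes from the OS CSPRNG
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, digest


def token_matches(presented: str, stored_digest: str) -> bool:
    """Hash the presented token and compare it to the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

With this design, a leaked session table does not directly yield usable tokens.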

4.1 TLS Configuration

Minimum protocol: TLS 1.2 (TLS 1.3 preferred)
Cipher suites: ECDHE+AESGCM, ECDHE+CHACHA20
HSTS enabled with 1-year max-age
Certificate transparency logging enabled
OCSP stapling enabled
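
For readers who want to see these settings expressed in code, here is a minimal Python sketch of an equivalent server-side TLS context. It is illustrative only: the production gateway is described above as PHP/Apache, certificate loading is omitted, and the function name is hypothetical. The cipher string applies to TLS 1.2; OpenSSL configures TLS 1.3 suites separately.

```python
import ssl

# HSTS with a 1-year max-age (365 * 24 * 3600 seconds).
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000")


def build_server_context() -> ssl.SSLContext:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2; TLS 1.3 is negotiated when the client supports it.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict TLS 1.2 to ECDHE key exchange with AES-GCM or ChaCha20 ciphers.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    return ctx


ctx = build_server_context()
```

A real server would additionally call `ctx.load_cert_chain(...)` with its certificate and key before accepting connections.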

5. AI & Machine Learning Usage

Key Principle: PII detection and PDF conversion use Anthropic's Claude API for text analysis. Original document files always remain on our infrastructure; only extracted text is sent for analysis. OCR processing is fully self-hosted. Document content is never stored — only issue metadata.

5.1 AI Components

Component	Purpose	Processing	Data Sent Externally
PII Detection Engine	Identify personal information patterns	Anthropic Claude API	Document text only (not files)
Name Recognition	Identify human names in context	Anthropic Claude API	Document text only (not files)
PDF to HTML Conversion	Convert PDF pages to structured accessible HTML	Anthropic Claude API	Document text only (not files)
OCR Processing	Extract text from images/scans	Self-hosted (Tesseract)	None

5.2 What We Don't Do

We do not send original document files externally — only extracted text for analysis
We do not use customer documents to train or fine-tune AI models
We do not store AI inference results beyond the immediate response
We do not use AI for automated decision-making that affects users

5.3 Third-Party AI Provider

SentraCheck uses Anthropic's Claude API for PII detection, text analysis, and PDF-to-HTML conversion. Key details:

What is sent: Extracted text content only (not original files, images, or metadata)
What is NOT sent: Original document files, user account information, or file names
Data retention: Anthropic's enterprise data retention policies apply (see Anthropic Privacy Policy)
Model training: Customer data is not used to train Anthropic's models under our enterprise agreement

5.4 Self-Hosted Components

The following components run entirely on our infrastructure with no external API calls:

OCR Processing: Tesseract OCR for text extraction from images and scanned documents
ADA Accessibility Checking: WCAG validation rules
Compliance Rule Engine: Pattern matching for regulatory requirements

Customer documents are never used for model training or improvement.

6. Logging & Audit Trail

6.1 What We Log

Event Type	Data Captured	Retention
Authentication	User ID, timestamp, IP, success/failure, user agent	90 days
API Requests	Endpoint, user ID, timestamp, response code, latency	90 days
Document Scans	User ID, timestamp, page count, processing time (no document content)	90 days
Admin Actions	Admin ID, action type, target, timestamp	2 years
Security Events	Event type, source IP, details, timestamp	1 year

6.2 What We Don't Log

Document content or file contents
Actual PII values (only each finding's type, location, and severity are recorded, and these are stored encrypted in findings)
User passwords (hashed only, never logged)

6.3 Log Access

Logs are encrypted at rest (AES-256)
Access restricted to authorized security personnel
All log access is itself logged
Logs available for customer audit upon request (for their own data)

7. Access Controls

7.1 Employee Access

Principle of least privilege: Employees only have access required for their role
MFA available: Multi-factor authentication supported for employee accounts
Access reviews: Quarterly review of all access permissions
Separation of duties: Production access separated from development
Background checks: Required for all employees with system access

7.2 Customer Access

Email/password authentication with optional MFA
Configurable session timeout
Concurrent session limits
Role-based access within organizations (Admin, User, Read-only)
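
The three organization roles could be modeled as a simple permission lookup, sketched below. Only the role names come from this document; the action names and the mapping of actions to roles are assumptions made for illustration.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "Admin"
    USER = "User"
    READ_ONLY = "Read-only"


# Hypothetical permission sets; the real assignments are not documented here.
PERMISSIONS = {
    Role.ADMIN: {"view_findings", "submit_scan", "manage_users"},
    Role.USER: {"view_findings", "submit_scan"},
    Role.READ_ONLY: {"view_findings"},
}


def can(role: Role, action: str) -> bool:
    """Deny by default: an action is allowed only if listed for the role."""
    return action in PERMISSIONS.get(role, set())
```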

8. Incident Response

SentraCheck maintains a documented incident response plan:

Detection: Automated monitoring and alerting (24/7)
Response: On-call security team with 15-minute response SLA
Notification: Customer notification within 72 hours of confirmed breach
Recovery: Documented recovery procedures with regular testing

9. Compliance & Certifications

Standard	Status	Notes
CCPA / CPRA	Compliant	Zero content retention model
NIST 800-122	Aligned	Documented controls
NIST 800-171	Aligned	Control mapping available under NDA
Section 508	Conformant	VPAT 2.4 available
WCAG 2.1 Level AA	Conformant	Self-assessed; third-party audit Q3 2026
AWS SOC 2	Inherited	Via hosting provider (AWS US-West-1)
SOC 2 Type II (own)	Planned 2027	Roadmap item
FedRAMP	Path planned	Enterprise tier, late 2026
GDPR	Compliant	DPA available upon request

10. Security Contact

For security-related inquiries, vulnerability reports, or to request documentation for audits:

Email: security@sentracheck.com
Response SLA: 24 hours for security inquiries
Vulnerability Reports: Acknowledged within 24 hours, triaged within 72 hours

Audit Documentation: Additional technical documentation, penetration test reports, and compliance attestations are available under NDA. Contact security@sentracheck.com to request access.