15 KiB
Fall-ID Anonymisierung — Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Replace patient last names in fall_ids with KVNR (or 6-char random fallback), retroactively migrate all existing fall_ids, and auto-update fall_id when KVNR is later entered.
Architecture: Modify generate_fall_id() to use KVNR instead of Nachname. Add a helper generate_random_suffix() for cases without KVNR. Write an Alembic migration that rebuilds all existing fall_ids. Extend the PUT /cases/{case_id}/kvnr endpoint to update the fall_id when the current suffix is a random sequence (not a KVNR).
Tech Stack: Python 3.12, FastAPI, SQLAlchemy 2.0, Alembic, Pydantic v2, pytest
Task 1: Update generate_fall_id() and add generate_random_suffix()
Files:
- Modify:
backend/app/services/import_service.py:22-30 - Test:
backend/tests/test_import.py
Step 1: Write failing tests for new fall_id format
Add these tests to backend/tests/test_import.py in TestGenerateFallId:
def test_uses_kvnr_when_available(self):
"""fall_id uses KVNR instead of Nachname when KVNR is present."""
pc = _make_parsed_case(nachname="Tonn", kvnr="A123456789", fallgruppe="onko", jahr=2026, kw=6)
result = generate_fall_id(pc)
assert result == "2026-06-onko-A123456789"
assert "Tonn" not in result
def test_random_suffix_when_no_kvnr(self):
"""fall_id uses 6-char random suffix when KVNR is missing."""
pc = _make_parsed_case(nachname="Tonn", kvnr=None, fallgruppe="onko", jahr=2026, kw=6)
result = generate_fall_id(pc)
# Format: YYYY-KW-fallgruppe-XXXXXX (6 alphanumeric uppercase+digits)
parts = result.split("-")
assert parts[0] == "2026"
assert parts[1] == "06"
assert parts[2] == "onko"
suffix = parts[3]
assert len(suffix) == 6
assert suffix.isalnum()
assert suffix == suffix.upper() # Only uppercase + digits
assert "Tonn" not in result
def test_random_suffix_when_empty_kvnr(self):
"""fall_id uses random suffix when KVNR is empty string."""
pc = _make_parsed_case(nachname="Tonn", kvnr="", fallgruppe="onko", jahr=2026, kw=6)
result = generate_fall_id(pc)
parts = result.split("-")
suffix = parts[3]
assert len(suffix) == 6
assert suffix.isalnum()
def test_random_suffixes_are_unique(self):
"""Different calls produce different random suffixes."""
pc = _make_parsed_case(kvnr=None)
ids = {generate_fall_id(pc) for _ in range(20)}
assert len(ids) >= 15 # Allow small collision chance
Also add a new test for generate_random_suffix:
from app.services.import_service import generate_random_suffix
class TestGenerateRandomSuffix:
def test_length(self):
assert len(generate_random_suffix()) == 6
def test_charset(self):
"""Only uppercase letters and digits."""
import re
for _ in range(50):
s = generate_random_suffix()
assert re.match(r'^[A-Z0-9]{6}$', s)
Step 2: Run tests to verify they fail
Run: cd /home/frontend/dak_c2s/backend && python -m pytest tests/test_import.py -v -k "kvnr or random_suffix"
Expected: FAIL — tests reference new behavior/function not yet implemented.
Step 3: Implement generate_random_suffix() and update generate_fall_id()
In backend/app/services/import_service.py, replace the existing generate_fall_id function and add generate_random_suffix:
import random
import string
def generate_random_suffix(length: int = 6) -> str:
"""Generate a random alphanumeric suffix (uppercase + digits)."""
charset = string.ascii_uppercase + string.digits
return "".join(random.choices(charset, k=length))
def generate_fall_id(parsed: ParsedCase) -> str:
"""Generate unique fall_id: YYYY-KW02d-fallgruppe-KVNR.
Uses KVNR as identifier. Falls back to 6-char random suffix if
KVNR is missing or empty.
Examples:
- 2026-06-onko-A123456789
- 2026-12-kardio-X7K9M2 (random fallback)
"""
suffix = parsed.kvnr if parsed.kvnr else generate_random_suffix()
return f"{parsed.jahr}-{parsed.kw:02d}-{parsed.fallgruppe}-{suffix}"
Step 4: Update existing tests that expect Nachname in fall_id
Update test_format and other tests in TestGenerateFallId that assert Nachname-based fall_ids. Since the default _make_parsed_case has kvnr="D410126355", the fall_id will now use that. Specifically update:
test_format: expect"2026-06-onko-D410126355"(not"2026-06-onko-Tonn")test_different_cases_produce_different_ids: change to use different KVNRstest_same_patient_same_week_same_fallgruppe: same KVNR → same fall_idtest_umlauts_preserved: pass kvnr=None, check random suffix format insteadtest_hyphenated_name: pass kvnr=None, check random suffix format insteadtest_all_fallgruppen: remains valid (checks-{fg}-pattern)test_year_boundary: remains valid (checks year prefix)
In TestImportRowSchema.test_full_row update fall_id="2026-06-kardio-D410126355".
In TestPreviewImportMocked tests that check fall_id is not None: still valid.
Step 5: Run all tests to verify they pass
Run: cd /home/frontend/dak_c2s/backend && python -m pytest tests/test_import.py -v
Expected: ALL PASS
Step 6: Update module docstring
Change the docstring at top of import_service.py from:
- fall_id generation: YYYY-KW02d-fallgruppe-Nachname
to:
- fall_id generation: YYYY-KW02d-fallgruppe-KVNR (or 6-char random suffix)
Step 7: Commit
git add backend/app/services/import_service.py backend/tests/test_import.py
git commit -m "feat: use KVNR instead of Nachname in fall_id generation"
Task 2: Update check_duplicate() for new fall_id format
Files:
- Modify:
backend/app/services/import_service.py:33-59 - Test:
backend/tests/test_import.py
Step 1: Analyze impact
check_duplicate() currently checks:
- Exact fall_id match
- Personal data match (nachname + fallgruppe + datum + optional vorname/geburtsdatum)
With the new fall_id format:
- Criterion 1 still works if same KVNR → same fall_id (deterministic).
- For random-suffix cases, fall_id match won't catch duplicates → criterion 2 is essential.
- Criterion 2 (personal data match) stays as-is — it doesn't depend on fall_id format.
No code change needed for check_duplicate() — it already has both detection paths. The personal data match (criterion 2) handles the random-suffix case correctly.
Step 2: Write a verification test
Add to TestCheckDuplicateMocked:
def test_duplicate_detected_by_personal_data_when_kvnr_missing(self):
"""Duplicate detected by personal data even when fall_id uses random suffix."""
db = MagicMock()
query = MagicMock()
db.query.return_value = query
query.filter.return_value = query
# First .first() (fall_id check) → no match, second .first() (personal data) → match
query.first.side_effect = [None, MagicMock()]
pc = _make_parsed_case(kvnr=None)
assert check_duplicate(db, pc) is True
Step 3: Run test
Run: cd /home/frontend/dak_c2s/backend && python -m pytest tests/test_import.py::TestCheckDuplicateMocked -v
Expected: ALL PASS
Step 4: Commit
git add backend/tests/test_import.py
git commit -m "test: verify duplicate detection works with new fall_id format"
Task 3: Retroactive migration of existing fall_ids
Files:
- Create:
backend/alembic/versions/006_anonymize_fall_ids.py
Step 1: Write the Alembic migration
This migration:
- Reads all cases with their fall_id and kvnr
- For each case, regenerates the fall_id as
YYYY-KW-fallgruppe-KVNR(or random suffix) - Updates in bulk
"""Anonymize fall_ids: replace Nachname with KVNR or random suffix.
Revision ID: 006_anonymize_fall_ids
Revises: 005_add_disclosure_requests
"""
import random
import string
from alembic import op
import sqlalchemy as sa
revision = "006_anonymize_fall_ids"
down_revision = "005_add_disclosure_requests"
branch_labels = None
depends_on = None
def _random_suffix(length=6):
charset = string.ascii_uppercase + string.digits
return "".join(random.choices(charset, k=length))
def upgrade():
conn = op.get_bind()
cases = conn.execute(
sa.text("SELECT id, fall_id, kvnr, jahr, kw, fallgruppe FROM cases")
).fetchall()
for case in cases:
case_id, old_fall_id, kvnr, jahr, kw, fallgruppe = case
suffix = kvnr if kvnr else _random_suffix()
new_fall_id = f"{jahr}-{kw:02d}-{fallgruppe}-{suffix}"
conn.execute(
sa.text("UPDATE cases SET fall_id = :new_id WHERE id = :case_id"),
{"new_id": new_fall_id, "case_id": case_id},
)
def downgrade():
# Cannot restore original Nachname-based fall_ids — data is lost
pass
Step 2: Run migration locally (dry-run check)
Run: cd /home/frontend/dak_c2s/backend && python -c "from alembic.versions import *; print('syntax ok')"
The actual migration will be run on production during deploy (Task 5).
Step 3: Commit
git add backend/alembic/versions/006_anonymize_fall_ids.py
git commit -m "feat: add migration to anonymize existing fall_ids"
Task 4: Auto-update fall_id when KVNR is entered
Files:
- Modify:
backend/app/api/cases.py:417-452(set_case_kvnr endpoint) - Test: manual verification after deploy
Step 1: Add helper to detect random-suffix fall_ids
In backend/app/services/import_service.py, add:
import re
# KVNR format: letter followed by 9 digits (e.g. A123456789)
KVNR_PATTERN = re.compile(r'^[A-Z]\d{9}$')
def has_random_suffix(fall_id: str) -> bool:
"""Check if a fall_id ends with a random suffix (not a KVNR).
Random suffix: exactly 6 alphanumeric chars (uppercase + digits).
KVNR: letter + 9 digits (10 chars total).
Nachname: any other string (legacy format, also treated as non-KVNR).
"""
if not fall_id:
return False
parts = fall_id.rsplit("-", 1)
if len(parts) < 2:
return False
suffix = parts[1]
# If it matches KVNR pattern, it's NOT a random suffix
if KVNR_PATTERN.match(suffix):
return False
# Otherwise it's either a random suffix or a legacy Nachname — either way, should be updated
return True
Step 2: Write tests for has_random_suffix()
Add to backend/tests/test_import.py:
from app.services.import_service import has_random_suffix
class TestHasRandomSuffix:
def test_kvnr_suffix(self):
"""Fall_id with KVNR is NOT random."""
assert has_random_suffix("2026-06-onko-A123456789") is False
def test_random_suffix(self):
"""Fall_id with 6-char random suffix IS random."""
assert has_random_suffix("2026-06-onko-X7K9M2") is True
def test_legacy_nachname_suffix(self):
"""Fall_id with legacy Nachname suffix IS treated as non-KVNR."""
assert has_random_suffix("2020-32-onko-Bartl-Zimmermann") is True
assert has_random_suffix("2026-06-kardio-Tonn") is True
def test_empty_fall_id(self):
assert has_random_suffix("") is False
assert has_random_suffix(None) is False
Step 3: Run tests
Run: cd /home/frontend/dak_c2s/backend && python -m pytest tests/test_import.py::TestHasRandomSuffix -v
Expected: ALL PASS
Step 4: Modify set_case_kvnr endpoint
In backend/app/api/cases.py, update the endpoint to also update the fall_id:
from app.services.import_service import has_random_suffix
@router.put("/{case_id}/kvnr", response_model=CaseResponse)
def set_case_kvnr(
case_id: int,
payload: dict,
request: Request,
db: Session = Depends(get_db),
user: User = Depends(get_current_user),
):
"""Update the KVNR for a case. Accessible to both admin and dak_mitarbeiter.
If the current fall_id has a random suffix (no KVNR), it is automatically
updated to use the new KVNR.
"""
case = db.query(Case).filter(Case.id == case_id).first()
if not case:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Case not found",
)
old_kvnr = case.kvnr
old_fall_id = case.fall_id
new_kvnr = payload.get("kvnr")
case.kvnr = new_kvnr
# Auto-update fall_id if it currently uses a random suffix or legacy Nachname
if new_kvnr and case.fall_id and has_random_suffix(case.fall_id):
prefix = case.fall_id.rsplit("-", 1)[0]
case.fall_id = f"{prefix}-{new_kvnr}"
case.updated_by = user.id
db.commit()
db.refresh(case)
log_action(
db,
user_id=user.id,
action="kvnr_updated",
entity_type="case",
entity_id=case.id,
old_values={"kvnr": old_kvnr, "fall_id": old_fall_id},
new_values={"kvnr": new_kvnr, "fall_id": case.fall_id},
ip_address=request.client.host if request.client else None,
user_agent=request.headers.get("user-agent"),
)
return case
Note: The rsplit("-", 1) approach works for the standard format YYYY-KW-fallgruppe-suffix. For legacy Nachnames with hyphens like 2020-32-onko-Bartl-Zimmermann, rsplit with maxsplit=1 gives ("2020-32-onko-Bartl", "Zimmermann") which is wrong. Better approach: reconstruct from case fields.
Corrected implementation — use case fields instead of string parsing:
if new_kvnr and case.fall_id and has_random_suffix(case.fall_id):
case.fall_id = f"{case.jahr}-{case.kw:02d}-{case.fallgruppe}-{new_kvnr}"
This is more robust because it uses the actual database fields.
Step 5: Commit
git add backend/app/services/import_service.py backend/app/api/cases.py backend/tests/test_import.py
git commit -m "feat: auto-update fall_id when KVNR is entered on a case"
Task 5: Build, deploy, and run migration
Files:
- Frontend:
frontend/(build) - Production DB: run Alembic migration
Step 1: Build frontend
Run: cd /home/frontend/dak_c2s/frontend && pnpm build
Expected: Build succeeds (no frontend code changes needed for this feature)
Step 2: Run all backend tests
Run: cd /home/frontend/dak_c2s/backend && python -m pytest tests/ -v
Expected: ALL PASS
Step 3: Commit all, push develop, merge to main
cd /home/frontend/dak_c2s
git push origin develop
git checkout main && git pull origin main && git merge develop && git push origin main
git checkout develop
Step 4: Deploy to Hetzner 1
# Pull latest code
ssh hetzner1 "cd /opt/dak-portal && git pull origin main"
# Run migration (updates all ~2900 fall_ids)
ssh hetzner1 "cd /opt/dak-portal/backend && source /opt/dak-portal/venv/bin/activate && alembic upgrade head"
# Build and deploy frontend
ssh hetzner1 "cd /opt/dak-portal/frontend && pnpm install && pnpm build && cp -r dist/* /var/www/vhosts/complexcaresolutions.de/dak.complexcaresolutions.de/dist/"
# Restart backend
ssh hetzner1 "systemctl restart dak-backend"
Step 5: Verify on production
- Check fall_ids no longer contain patient names:
ssh hetzner1 "mysql -u dak_c2s_admin -p'R@TIa&s7ygxm4x0b' dak_c2s -e \"SELECT fall_id FROM cases LIMIT 20\"" - Import a new CSV → verify new fall_ids use KVNR format
- Enter KVNR on a case with random suffix → verify fall_id updates automatically