Examples

Practical code examples that demonstrate the four principles. Each example shows what LLMs typically get wrong and how to do it better.

1. Think Before Coding

Example 1: Hidden Assumptions

User Request: "Add a feature to export user data"

❌ What LLMs do (wrong assumptions)

def export_users(format='json'):
    """Export all users to a file."""
    users = User.query.all()
    
    if format == 'json':
        with open('users.json', 'w') as f:
            json.dump([u.to_dict() for u in users], f)
    elif format == 'csv':
        with open('users.csv', 'w') as f:
            writer = csv.DictWriter(f, fieldnames=['id', 'email', 'name'])
            writer.writeheader()
            writer.writerows([u.to_dict() for u in users])
    
    return f"Exported {len(users)} users"

Problems:

It assumed that all users should be exported
It picked the file path without asking
It silently chose which fields to export
It hard-coded the CSV field names without checking the actual data structure

✅ What should happen (surface the assumptions)

Before implementing, I need to clarify:

1. **Scope**: Export all users or filtered subset? (privacy implications)
2. **Format**: You mentioned "export" - did you mean:
   - Download file in browser?
   - Background job with email notification?
   - API endpoint returning data?
3. **Fields**: Which user fields? (some might be sensitive)
4. **Volume**: How many users typically? (affects approach)

Simplest approach: Add an API endpoint that returns paginated JSON.
Would need more info for file-based exports.

What's your preference?

Example 2: Multiple Interpretations

User Request: "Make the search faster"

❌ What LLMs do (silently pick one)

# Adds caching, database indexes, and async processing
# without asking which "faster" matters
@lru_cache(maxsize=1000)
async def search(query: str) -> List[Result]:
    # ... 200 lines of optimization code

✅ What should happen

"Make the search faster" could mean:

1. **Faster response time** (< 100ms instead of 500ms)
   - Add database indexes
   - Cache frequent queries
   - Estimated effort: 2 hours

2. **Handle more concurrent searches** (throughput)
   - Add async/await
   - Connection pooling
   - Estimated effort: 4 hours

3. **Faster perceived speed** (UX)
   - Show partial results immediately
   - Progressive loading
   - Estimated effort: 3 hours

Current search takes ~500ms for typical queries. 
Which aspect matters most for your use case?
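
The response above quotes a current latency figure, which implies a measurement taken before any optimization. A minimal sketch of how such a baseline might be collected follows; `measure_latency_ms` and the stand-in `search` function are hypothetical names introduced for illustration and are not part of the original example.

```python
import statistics
import time


def measure_latency_ms(func, queries, repeats=5):
    """Return median and worst-case wall-clock latency in milliseconds."""
    samples = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            func(query)
            samples.append((time.perf_counter() - start) * 1000)
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "runs": len(samples),
    }


def search(query):
    """Hypothetical stand-in for the real search function."""
    corpus = ["alpha", "beta", "gamma", "delta"]
    return [item for item in corpus if query in item]


# Record the baseline first, so any later change can be compared against it.
baseline = measure_latency_ms(search, ["a", "et", "zzz"])
print(baseline)
```

A baseline like this only covers the first interpretation (response time). Throughput and perceived speed would need different measurements.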

2. Simplicity First

Example 1: Over-Abstraction

User Request: "Add a function to calculate discount"

❌ What LLMs do (overengineered)

from abc import ABC, abstractmethod
from enum import Enum
from typing import Protocol, Union
from dataclasses import dataclass

class DiscountStrategy(ABC):
    @abstractmethod
    def calculate(self, amount: float) -> float:
        pass

class PercentageDiscount(DiscountStrategy):
    def __init__(self, percentage: float):
        self.percentage = percentage
    
    def calculate(self, amount: float) -> float:
        return amount * (self.percentage / 100)

class FixedDiscount(DiscountStrategy):
    def __init__(self, fixed_amount: float):
        self.fixed_amount = fixed_amount
    
    def calculate(self, amount: float) -> float:
        return min(self.fixed_amount, amount)

@dataclass
class DiscountConfig:
    strategy: DiscountStrategy
    min_purchase: float = 0.0
    max_discount: float = float('inf')

class DiscountCalculator:
    def __init__(self, config: DiscountConfig):
        self.config = config
    
    def apply_discount(self, amount: float) -> float:
        if amount < self.config.min_purchase:
            return 0
        discount = self.config.strategy.calculate(amount)
        return min(discount, self.config.max_discount)

# Usage requires 30+ lines of setup for a simple calculation

✅ What should happen (simple)

def calculate_discount(amount: float, percent: float) -> float:
    """Calculate discount amount. percent should be 0-100."""
    return amount * (percent / 100)

# Usage
discount = calculate_discount(100.0, 10.0)  # $10 off

When complexity makes sense: only once several discount types are actually needed. If that requirement arrives later, refactor then.

Example 2: Speculative Features

User Request: "Save user preferences to database"

❌ What LLMs do (unnecessary features)

class PreferenceManager:
    def __init__(self, db, cache=None, validator=None):
        self.db = db
        self.cache = cache or InMemoryCache()
        self.validator = validator or DefaultValidator()
    
    def save(self, user_id: int, prefs: dict, 
             merge: bool = True, 
             validate: bool = True,
             notify: bool = False) -> bool:
        """Save preferences with optional merging, validation, and notifications."""
        
        if validate:
            errors = self.validator.validate(prefs)
            if errors:
                raise ValidationError(errors)
        
        existing = self.get(user_id) if merge else {}
        
        if merge:
            existing.update(prefs)
            prefs = existing
        
        self.db.execute(
            "UPDATE users SET preferences = ? WHERE id = ?",
            (json.dumps(prefs), user_id)
        )
        
        if self.cache:
            self.cache.set(f"prefs:{user_id}", prefs)
        
        if notify:
            self.notify_preference_change(user_id, prefs)
        
        return True
    
    def notify_preference_change(self, user_id: int, prefs: dict):
        # Another 30 lines for a feature nobody asked for
        pass

✅ What should happen (exactly what was asked)

def save_preferences(db, user_id: int, preferences: dict):
    """Save user preferences to database."""
    db.execute(
        "UPDATE users SET preferences = ? WHERE id = ?",
        (json.dumps(preferences), user_id)
    )

Add later if needed: caching, validation, and merge logic, but only once the need is real.

3. Surgical Changes

Example 1: Drive-by Refactoring

User Request: "Fix the bug where empty emails crash the validator"

❌ What LLMs do (too much at once)

  def validate_user(user_data):
-     # Check email format
-     if not user_data.get('email'):
+     """Validate user data."""
+     email = user_data.get('email', '').strip()
+     
+     # Validate email
+     if not email:
          raise ValueError("Email required")
-     
-     # Basic email validation
-     if '@' not in user_data['email']:
+     if '@' not in email or '.' not in email.split('@')[1]:
          raise ValueError("Invalid email")
      
-     # Check username
-     if not user_data.get('username'):
+     # Validate username
+     username = user_data.get('username', '').strip()
+     if not username:
          raise ValueError("Username required")
+     if len(username) < 3:
+         raise ValueError("Username too short")
+     if not username.isalnum():
+         raise ValueError("Username must be alphanumeric")
      
      return True

Problems:

The email validation was changed beyond the actual bug fix
Extra username validation was added even though nobody asked for it
Comments were changed
A docstring was added

✅ What should happen (surgical)

  def validate_user(user_data):
      # Check email format
-     if not user_data.get('email'):
+     email = user_data.get('email', '')
+     if not email or not email.strip():
          raise ValueError("Email required")
      
      # Basic email validation
-     if '@' not in user_data['email']:
+     if '@' not in email:
          raise ValueError("Invalid email")
      
      # Check username
      if not user_data.get('username'):
          raise ValueError("Username required")
      
      return True

Only changed: exactly the lines that fix the empty-email bug.

Example 2: Style Drift

User Request: "Add logging to the upload function"

❌ What LLMs do (reformat everything)

- def upload_file(file_path, destination):
+ def upload_file(file_path: str, destination: str) -> bool:
+     """Upload file to destination with logging."""
+     logger.info(f"Uploading {file_path} to {destination}")
+     
      try:
-         with open(file_path, 'rb') as f:
-             data = f.read()
-         
-         response = requests.post(destination, files={'file': data})
-         
-         if response.status_code == 200:
-             return True
-         else:
-             return False
+         with open(file_path, "rb") as f:
+             data = f.read()
+         response = requests.post(destination, files={"file": data})
+         success = response.status_code == 200
+         
+         if success:
+             logger.info(f"Upload successful: {file_path}")
+         else:
+             logger.error(f"Upload failed: {file_path}, status={response.status_code}")
+         
+         return success
      except Exception as e:
-         print(f"Error: {e}")
+         logger.exception(f"Upload error for {file_path}: {e}")
          return False

Problems:

The quote style was changed
Type hints were added even though nobody asked for them
A docstring was added
Whitespace was reformatted
The return logic was changed

✅ What should happen (match the existing style)

+ import logging
+ 
+ logger = logging.getLogger(__name__)
+ 
  def upload_file(file_path, destination):
+     logger.info(f'Starting upload: {file_path}')
      try:
          with open(file_path, 'rb') as f:
              data = f.read()
          
          response = requests.post(destination, files={'file': data})
          
          if response.status_code == 200:
+             logger.info(f'Upload successful: {file_path}')
              return True
          else:
+             logger.error(f'Upload failed: {file_path}, status={response.status_code}')
              return False
      except Exception as e:
-         print(f"Error: {e}")
+         logger.exception(f'Upload error: {file_path}')
          return False

Preserved: single quotes, no type hints, the existing boolean pattern, and the existing spacing style.

4. Goal-Driven Execution

Example 1: Vague vs. Verifiable

User Request: "Fix the authentication system"

❌ What LLMs do (vague approach)

I'll fix the authentication system by:
1. Reviewing the code
2. Identifying issues
3. Making improvements
4. Testing the changes

[Proceeds to make changes without clear success criteria]

✅ What should happen (verifiable goals)

To "fix authentication," I need to define success criteria. 
What specific issue are we solving?

If the issue is "users stay logged in after password change":

Plan:
1. Write test: Change password → verify old session invalidated
   Verify: Test fails (reproduces bug)

2. Implement: Invalidate sessions on password change
   Verify: Test passes

3. Check edge cases: Multiple active sessions, concurrent changes
   Verify: Additional tests pass

4. Verify no regression: Existing auth tests still pass
   Verify: Full test suite green

Current test coverage for auth: [shows coverage]
What's the specific auth issue you're seeing?
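
Steps 1 to 3 of the plan above could be sketched as follows. This is a toy illustration under stated assumptions: `AuthService` and all of its methods are hypothetical, the store is in-memory, and passwords are kept in plain text only to keep the example short.

```python
import secrets


class AuthService:
    """Toy in-memory auth service, for illustration only (no password hashing)."""

    def __init__(self):
        self._passwords = {}  # username -> password
        self._sessions = {}   # session token -> username

    def register(self, username, password):
        self._passwords[username] = password

    def login(self, username, password):
        if self._passwords.get(username) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_hex(16)
        self._sessions[token] = username
        return token

    def is_session_valid(self, token):
        return token in self._sessions

    def change_password(self, username, new_password):
        self._passwords[username] = new_password
        # Step 2 of the plan: drop every session that belongs to this user.
        # Without this block the test below fails, which reproduces the bug.
        stale = [t for t, u in self._sessions.items() if u == username]
        for token in stale:
            del self._sessions[token]


def test_password_change_invalidates_old_sessions():
    auth = AuthService()
    auth.register("alice", "old-secret")
    auth.register("bob", "bobs-secret")
    first = auth.login("alice", "old-secret")
    second = auth.login("alice", "old-secret")  # edge case: multiple active sessions
    other = auth.login("bob", "bobs-secret")

    auth.change_password("alice", "new-secret")

    assert not auth.is_session_valid(first)
    assert not auth.is_session_valid(second)
    assert auth.is_session_valid(other)  # other users are unaffected
    assert auth.is_session_valid(auth.login("alice", "new-secret"))
```

The point of the sketch is the order of work: the test exists and fails before the invalidation logic is written, so "fixed" has a concrete definition.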

Example 2: Multiple Steps with Verification

User Request: "Add rate limiting to the API"

❌ What LLMs do (everything at once)

# Implements full rate limiting with Redis, multiple strategies,
# configuration system, and monitoring in one 300-line commit
# No clear verification steps

✅ What should happen (incremental, with verification)

Plan for rate limiting:

1. Add basic in-memory rate limiting (single endpoint)
   Verify: 
   - Test: 100 requests → first 10 succeed, rest get 429
   - Manual: curl endpoint 11 times, see rate limit error

2. Extract to middleware (apply to all endpoints)
   Verify:
   - Test: Rate limits apply to /users and /posts
   - Existing endpoint tests still pass

3. Add Redis backend (for multi-server)
   Verify:
   - Test: Rate limit persists across app restarts
   - Test: Two app instances share rate limit counter

4. Add configuration (rates per endpoint)
   Verify:
   - Test: /search allows 10/min, /users allows 100/min
   - Config file parsed correctly

Each step is independently verifiable and deployable.
Start with step 1?
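
Step 1 of the plan above might look like the following sketch. It assumes a sliding-window limit of 10 requests per 60 seconds per client; `InMemoryRateLimiter`, `handle_request`, and the injectable `clock` parameter are hypothetical names chosen for illustration, not an existing library API.

```python
import time
from collections import defaultdict, deque


class InMemoryRateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds per client."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def handle_request(limiter, client_id):
    """Hypothetical handler: returns an HTTP-style status code."""
    return 200 if limiter.allow(client_id) else 429


def test_first_10_succeed_rest_get_429():
    # Freeze the clock so all 100 requests fall into the same window.
    limiter = InMemoryRateLimiter(limit=10, window=60.0, clock=lambda: 0.0)
    statuses = [handle_request(limiter, "client-a") for _ in range(100)]
    assert statuses[:10] == [200] * 10
    assert statuses[10:] == [429] * 90
```

Because the state lives in one process, this version deliberately does not cover steps 3 and 4. Counters reset on restart and are not shared between instances, which is exactly what the later Redis step would verify.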

Example 3: Test-First Verification

User Request: "The sorting breaks when there are duplicate scores"

❌ What LLMs do (fix without reproducing)

# Immediately changes sort logic without confirming the bug
def sort_scores(scores):
    return sorted(scores, key=lambda x: (-x['score'], x['name']))

✅ What should happen (reproduce first)

# 1. First, write a test that reproduces the issue
def test_sort_with_duplicate_scores():
    """Tied scores must come back in a deterministic order (by name)."""
    scores = [
        {'name': 'Bob', 'score': 100},
        {'name': 'Charlie', 'score': 90},
        {'name': 'Alice', 'score': 100},
    ]
    
    result = sort_scores(scores)
    
    # The bug: the order of tied entries depends on the input order.
    # Asserting on names (not just scores) is what lets this test fail.
    assert [r['name'] for r in result] == ['Alice', 'Bob', 'Charlie']

# Verify: Test fails before the fix
# (assuming the current implementation sorts by score only: Bob comes before Alice)

# 2. Now fix by adding a deterministic tie-breaker
def sort_scores(scores):
    """Sort by score descending, then name ascending for ties."""
    return sorted(scores, key=lambda x: (-x['score'], x['name']))

# Verify: Test passes consistently

Summary of Anti-Patterns

Principle	Anti-Pattern	Better
Think Before Coding	Silently assuming the file format, fields, or scope	List assumptions explicitly and ask clarifying questions
Simplicity First	Strategy pattern for a single discount calculation	One function until real complexity is needed
Surgical Changes	Reformatting quotes and adding type hints while fixing a bug	Change only the lines that fix the reported problem
Goal-Driven	"I'll review and improve the code"	"Write test for bug X → make it pass → verify no regressions"

Key Insight

The "overcomplicated" examples are not obviously wrong: they follow design patterns and best practices. The problem is timing. They introduce complexity before it is needed. That:

makes the code harder to understand
leads to more bugs
takes more time to implement
makes testing harder

The simple versions are:

easier to understand
faster to implement
easier to test
still open to refactoring later, once real complexity arises

Good code solves today's problem simply; it does not prematurely anticipate tomorrow's.
