Advanced Error Handling and Debugging
50 min
Error handling is crucial for building robust Python applications that gracefully handle failures and provide meaningful feedback. Python's exception system allows you to catch, handle, and propagate errors in a structured way. Understanding different error types (syntax errors, runtime errors, logical errors) and how to handle them appropriately is essential for writing production-quality code. Good error handling improves user experience, makes debugging easier, and prevents applications from crashing unexpectedly.
Python's exception hierarchy is rooted at `BaseException`, from which every exception inherits. Its subclass `Exception` is the base for nearly all built-in errors and should be the base for user-defined ones; `SystemExit`, `KeyboardInterrupt`, and `GeneratorExit` inherit from `BaseException` directly. Built-in exceptions include `ValueError` (right type but inappropriate value), `TypeError` (wrong type), `KeyError`/`IndexError` (missing key or out-of-range index), `FileNotFoundError` (file doesn't exist), and many more. You can catch a specific exception, several exceptions at once, or use `except Exception` to catch every non-system-exiting exception (though this should be used carefully). The `else` clause runs only when no exception occurs, and `finally` always runs, which makes it the place for cleanup.
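The clauses described above can be seen together in one small function. This is a minimal sketch; `parse_port` and its `events` list are hypothetical names used only to record which clauses ran.

```python
def parse_port(text):
    """Parse a port number and record which clauses of the try statement ran."""
    events = []
    try:
        port = int(text)             # may raise ValueError or TypeError
    except ValueError:
        events.append("invalid")     # right type (str), unusable value
        port = None
    except TypeError:
        events.append("wrong type")  # e.g. None instead of a string
        port = None
    else:
        events.append("parsed")      # runs only when no exception was raised
    finally:
        events.append("done")        # runs in every case
    return port, events
```

Calling `parse_port("8080")` takes the `else` path, while `parse_port("abc")` and `parse_port(None)` take the two `except` paths; `finally` runs in all three cases.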
Custom exceptions allow you to create domain-specific error types that clearly communicate what went wrong. Create custom exceptions by inheriting from `Exception` (or more specific exceptions) and adding relevant attributes. Exception hierarchies enable catching related errors together (catch the base class to catch all subclasses). Custom exceptions should be descriptive, include relevant context, and follow naming conventions (end with 'Error'). They make error handling more precise and code more maintainable.
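As a rough illustration of an exception hierarchy, the sketch below defines a base class and two subclasses and catches them through the base. The payment domain and every name in it (`PaymentError`, `charge`, `try_charge`) are assumptions made up for the example.

```python
class PaymentError(Exception):
    """Base class for all (hypothetical) payment failures."""


class InsufficientFundsError(PaymentError):
    """Carries the numbers needed to understand the failure."""

    def __init__(self, balance, amount):
        super().__init__(f"balance {balance} is less than requested {amount}")
        self.balance = balance
        self.amount = amount


class CardExpiredError(PaymentError):
    """Second subclass, shown only to illustrate the hierarchy."""


def charge(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount


def try_charge(balance, amount):
    try:
        return charge(balance, amount)
    except PaymentError as exc:  # catching the base class catches every subclass
        return f"{type(exc).__name__}: {exc}"
```

A caller that needs the details can instead catch `InsufficientFundsError` directly and read `exc.balance` and `exc.amount`.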
Python's debugging tools include `pdb` (Python debugger for interactive debugging), `traceback` (for extracting and formatting stack traces), `logging` (for recording application events and errors), `assert` statements (for checking assumptions during development; they are skipped when Python runs with `-O`), and IDE debuggers. `pdb` allows you to set breakpoints, step through code, inspect variables, and evaluate expressions. The `logging` module provides flexible logging with different levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) and multiple handlers (console, file, network). Proper logging is essential for debugging production issues.
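A small sketch of `logging` and `traceback` working together follows. It logs to an in-memory stream so that it writes no files; the logger name `demo.orders` and the helper functions are hypothetical.

```python
import io
import logging
import traceback

log_stream = io.StringIO()  # in-memory target so the sketch writes no files
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("demo.orders")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep this output away from the root logger


def average(values):
    logger.debug("averaging %d values", len(values))
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        # logger.exception logs at ERROR level and appends the active traceback
        logger.exception("cannot average an empty list")
        return None


def format_failure():
    """Return the formatted traceback of a deliberately triggered KeyError."""
    try:
        {}["missing"]
    except KeyError:
        return traceback.format_exc()
```

For interactive work, calling the built-in `breakpoint()` (Python 3.7+) at a line of interest drops into `pdb` there by default.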
Error handling strategies include fail-fast (raise exceptions immediately when errors are detected), fail-safe (handle errors gracefully and continue), and defensive programming (validate inputs, handle edge cases). The choice depends on context—critical errors should fail fast, while recoverable errors can be handled gracefully. Always log errors with sufficient context (what happened, where, when, why) for debugging. Use specific exception types rather than catching everything, and let exceptions propagate when you can't handle them meaningfully.
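The contrast between fail-fast and fail-safe can be sketched as two small functions. Both are hypothetical: `load_timeout` treats a bad setting as unrecoverable and raises immediately, while `read_cache` treats a miss as routine and falls back to a default.

```python
def load_timeout(config):
    """Fail fast: an invalid setting should stop the program early."""
    timeout = config.get("timeout")
    is_number = isinstance(timeout, (int, float)) and not isinstance(timeout, bool)
    if not is_number or timeout <= 0:
        raise ValueError(f"timeout must be a positive number, got {timeout!r}")
    return timeout


def read_cache(cache, key, default=None):
    """Fail safe: a cache miss is recoverable, so return a fallback."""
    try:
        return cache[key]
    except KeyError:
        return default
```

Which style fits depends on whether the caller can do anything useful after the failure; the validation in `load_timeout` is also an instance of the defensive input checking mentioned above.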
Best practices include using specific exception types, providing meaningful error messages, logging errors with context, using try-except-finally for resource cleanup, creating custom exceptions for domain-specific errors, and testing error handling paths. Avoid bare `except:` clauses (catch `Exception` explicitly), don't suppress exceptions silently, and don't use exceptions for routine control flow where a plain conditional is clearer. Good error handling makes applications more robust, debuggable, and user-friendly.
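One way to combine a specific exception type with a meaningful message is to translate a low-level error into a domain error while keeping the original attached through `raise ... from`. The sketch below assumes a hypothetical `ConfigError` and `parse_retries`.

```python
class ConfigError(Exception):
    """Hypothetical domain-specific error for bad configuration values."""


def parse_retries(raw):
    try:
        return int(raw)
    except ValueError as exc:
        # "from exc" stores the original error on __cause__, so the
        # traceback shows both the low-level and the domain-level failure
        raise ConfigError(f"retries must be an integer, got {raw!r}") from exc
```

The caller sees a `ConfigError` with a clear message, and the underlying `ValueError` stays available for debugging instead of being silently discarded.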
Key Concepts
- Python's exception system handles errors in a structured way.
- Custom exceptions enable domain-specific error types.
- try-except-finally blocks handle errors and ensure cleanup.
- pdb, traceback, and logging are essential debugging tools.
- Good error handling improves robustness and debuggability.
Learning Objectives
Master
- Creating custom exceptions for domain-specific errors
- Using try-except-finally for proper error handling and cleanup
- Working with Python debugging tools (pdb, traceback, logging)
- Implementing effective error handling strategies
Develop
- Defensive programming and error handling thinking
- Understanding exception propagation and handling
- Designing robust, fault-tolerant applications
Tips
- Use specific exception types rather than catching everything.
- Always log errors with sufficient context for debugging.
- Use try-except-finally for resource cleanup.
- Create custom exceptions for domain-specific error types.
Common Pitfalls
- Using bare `except:` clauses, which also catch system-exiting exceptions such as `SystemExit` and `KeyboardInterrupt`.
- Suppressing exceptions silently, making debugging impossible.
- Using exceptions for routine control flow where a plain conditional would do, which obscures intent.
- Not providing meaningful error messages, making debugging difficult.
Summary
- Python's exception system provides structured error handling.
- Custom exceptions enable domain-specific, meaningful error types.
- try-except-finally ensures proper error handling and cleanup.
- Debugging tools (pdb, traceback, logging) are essential for troubleshooting.
- Good error handling makes applications robust and maintainable.
Exercise
Implement a comprehensive error handling and debugging system with custom exceptions, logging, debugging utilities, and error recovery mechanisms.
import traceback
import logging
import sys
from typing import Any, Dict, List, Optional, Callable
from datetime import datetime
import json

# Configure logging: DEBUG and above go to both a file and stdout
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('application.log'),
        logging.StreamHandler(sys.stdout)
    ]
)
logger = logging.getLogger(__name__)


class ApplicationError(Exception):
    """Base exception class for application-specific errors."""

    def __init__(self, message: str, error_code: Optional[str] = None,
                 details: Optional[Dict[str, Any]] = None):
        super().__init__(message)
        self.message = message
        self.error_code = error_code or "UNKNOWN_ERROR"
        self.details = details or {}
        self.timestamp = datetime.now()
        # format_exc() only has something to report while another exception
        # is being handled; otherwise it returns the string "NoneType: None"
        self.traceback = traceback.format_exc() if sys.exc_info()[0] is not None else None

    def to_dict(self) -> Dict[str, Any]:
        """Convert exception to dictionary for logging or API responses."""
        return {
            "error_type": self.__class__.__name__,
            "message": self.message,
            "error_code": self.error_code,
            "details": self.details,
            "timestamp": self.timestamp.isoformat(),
            "traceback": self.traceback
        }

    def __str__(self):
        return f"{self.__class__.__name__}: {self.message} (Code: {self.error_code})"


class ValidationError(ApplicationError):
    """Exception raised when data validation fails."""

    def __init__(self, field: str, value: Any, rule: str, message: Optional[str] = None):
        super().__init__(
            message or f"Validation failed for field '{field}' with value '{value}' using rule '{rule}'",
            "VALIDATION_ERROR",
            {"field": field, "value": value, "rule": rule}
        )
        self.field = field
        self.value = value
        self.rule = rule


class DatabaseError(ApplicationError):
    """Exception raised when database operations fail."""

    def __init__(self, operation: str, table: str, message: str,
                 original_error: Optional[Exception] = None):
        super().__init__(
            f"Database operation '{operation}' on table '{table}' failed: {message}",
            "DATABASE_ERROR",
            {"operation": operation, "table": table,
             "original_error": str(original_error) if original_error else None}
        )
        self.operation = operation
        self.table = table
        self.original_error = original_error


class NetworkError(ApplicationError):
    """Exception raised when network operations fail."""

    def __init__(self, url: str, method: str, status_code: Optional[int] = None,
                 message: Optional[str] = None):
        super().__init__(
            message or f"Network request to '{url}' using {method} failed",
            "NETWORK_ERROR",
            {"url": url, "method": method, "status_code": status_code}
        )
        self.url = url
        self.method = method
        self.status_code = status_code


class ErrorHandler:
    """Centralized error handling and logging system."""

    def __init__(self):
        self.error_counts: Dict[str, int] = {}
        self.error_history: List[Dict[str, Any]] = []
        self.max_history_size = 1000
        self.recovery_strategies: Dict[str, Callable] = {}

    def handle_error(self, error: Exception, context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """Handle an error and return error information."""
        # Log the error
        self._log_error(error, context)
        # Update error counts
        error_type = error.__class__.__name__
        self.error_counts[error_type] = self.error_counts.get(error_type, 0) + 1
        # Add to history
        self._add_to_history(error, context)
        # Try to recover if possible
        recovery_result = self._attempt_recovery(error, context)
        # Return error information
        return {
            "error": error,
            "context": context or {},
            "recovery_attempted": recovery_result is not None,
            "recovery_result": recovery_result,
            "error_count": self.error_counts[error_type],
            "timestamp": datetime.now().isoformat()
        }

    def _log_error(self, error: Exception, context: Optional[Dict[str, Any]] = None):
        """Log error with context information."""
        if isinstance(error, ApplicationError):
            error_info = error.to_dict()
            logger.error(f"Application error: {error_info}")
        else:
            # Pass the exception itself so its traceback is logged even when
            # this method is called outside an except block
            logger.error(f"System error: {str(error)}", exc_info=error)
        if context:
            logger.error(f"Error context: {json.dumps(context, indent=2, default=str)}")

    def _add_to_history(self, error: Exception, context: Optional[Dict[str, Any]] = None):
        """Add error to history, maintaining size limit."""
        error_record = {
            "timestamp": datetime.now().isoformat(),
            "error_type": error.__class__.__name__,
            "message": str(error),
            "context": context or {}
        }
        self.error_history.append(error_record)
        # Maintain history size by dropping the oldest record
        if len(self.error_history) > self.max_history_size:
            self.error_history.pop(0)

    def _attempt_recovery(self, error: Exception, context: Optional[Dict[str, Any]] = None) -> Any:
        """Attempt to recover from the error using registered strategies."""
        error_type = error.__class__.__name__
        if error_type in self.recovery_strategies:
            try:
                recovery_func = self.recovery_strategies[error_type]
                result = recovery_func(error, context)
                logger.info(f"Recovery strategy executed for {error_type}: {result}")
                return result
            except Exception as recovery_error:
                logger.error(f"Recovery strategy failed for {error_type}: {recovery_error}")
                return None
        return None

    def register_recovery_strategy(self, error_type: str, strategy: Callable):
        """Register a recovery strategy for a specific error type."""
        self.recovery_strategies[error_type] = strategy
        logger.info(f"Recovery strategy registered for {error_type}")

    def get_error_statistics(self) -> Dict[str, Any]:
        """Get statistics about errors."""
        total_errors = sum(self.error_counts.values())
        return {
            "total_errors": total_errors,
            "error_counts": self.error_counts.copy(),
            "history_size": len(self.error_history),
            "recovery_strategies": list(self.recovery_strategies.keys())
        }

    def clear_history(self):
        """Clear error history."""
        self.error_history.clear()
        logger.info("Error history cleared")


class DebugHelper:
    """Utility class for debugging and troubleshooting."""

    @staticmethod
    def inspect_object(obj: Any, max_depth: int = 3, current_depth: int = 0) -> Dict[str, Any]:
        """Inspect an object and return detailed information."""
        if current_depth >= max_depth:
            return {"type": type(obj).__name__, "value": str(obj), "max_depth_reached": True}
        try:
            if hasattr(obj, '__dict__'):
                # Object with attributes
                result = {
                    "type": type(obj).__name__,
                    "module": getattr(obj, '__module__', 'Unknown'),
                    "attributes": {}
                }
                for attr_name, attr_value in obj.__dict__.items():
                    if not attr_name.startswith('_'):
                        result["attributes"][attr_name] = DebugHelper.inspect_object(
                            attr_value, max_depth, current_depth + 1
                        )
                return result
            elif isinstance(obj, (list, tuple)):
                # Sequence types
                return {
                    "type": type(obj).__name__,
                    "length": len(obj),
                    "items": [DebugHelper.inspect_object(item, max_depth, current_depth + 1)
                              for item in obj[:5]]  # Limit to first 5 items
                }
            elif isinstance(obj, dict):
                # Dictionary
                return {
                    "type": type(obj).__name__,
                    "length": len(obj),
                    "keys": list(obj.keys())[:10],  # Limit to first 10 keys
                    "sample_values": {k: DebugHelper.inspect_object(v, max_depth, current_depth + 1)
                                      for k, v in list(obj.items())[:5]}
                }
            else:
                # Simple types
                return {
                    "type": type(obj).__name__,
                    "value": str(obj),
                    "repr": repr(obj)
                }
        except Exception as e:
            return {
                "type": type(obj).__name__,
                "error": f"Inspection failed: {str(e)}"
            }

    @staticmethod
    def get_call_stack() -> List[Dict[str, str]]:
        """Get the current call stack information."""
        stack = traceback.extract_stack()
        return [
            {
                "filename": frame.filename,
                "line_number": frame.lineno,
                "function_name": frame.name,
                "line_content": frame.line
            }
            for frame in stack[:-1]  # Exclude this function
        ]

    @staticmethod
    def measure_execution_time(func: Callable, *args, **kwargs) -> Dict[str, Any]:
        """Measure execution time and, if psutil is installed, memory usage of a function."""
        import time
        try:
            import psutil  # optional third-party package
        except ImportError:
            psutil = None  # memory figures are reported as 0 without it

        def rss() -> int:
            """Resident memory of this process in bytes, or 0 without psutil."""
            return psutil.Process().memory_info().rss if psutil else 0

        start_time = time.perf_counter()
        start_memory = rss()
        try:
            result = func(*args, **kwargs)
            execution_time = time.perf_counter() - start_time
            end_memory = rss()
            return {
                "success": True,
                "result": result,
                "execution_time": execution_time,
                "memory_delta": end_memory - start_memory,
                "start_memory": start_memory,
                "end_memory": end_memory
            }
        except Exception as e:
            execution_time = time.perf_counter() - start_time
            return {
                "success": False,
                "error": str(e),
                "execution_time": execution_time,
                "memory_delta": 0
            }


class DataValidator:
    """Data validation with comprehensive error reporting."""

    def __init__(self):
        self.validation_rules = {}
        self.custom_validators = {}

    def add_rule(self, field: str, rule: str, validator: Callable, message: str = None):
        """Add a validation rule for a field."""
        if field not in self.validation_rules:
            self.validation_rules[field] = []
        self.validation_rules[field].append({
            "rule": rule,
            "validator": validator,
            "message": message
        })

    def validate(self, data: Dict[str, Any]) -> List[ValidationError]:
        """Validate data against all registered rules."""
        errors = []
        for field, rules in self.validation_rules.items():
            if field in data:
                value = data[field]
                for rule_info in rules:
                    try:
                        if not rule_info["validator"](value):
                            message = rule_info["message"] or f"Validation failed for {field}"
                            errors.append(ValidationError(field, value, rule_info["rule"], message))
                    except Exception as e:
                        errors.append(ValidationError(field, value, rule_info["rule"], f"Validator error: {e}"))
            else:
                # Field is missing
                errors.append(ValidationError(field, None, "required", f"Field '{field}' is required"))
        return errors


# Example usage and demonstration
def demonstrate_error_handling():
    """Demonstrate the comprehensive error handling system."""
    print("=== Comprehensive Error Handling Demo ===\n")
    # Initialize error handler
    error_handler = ErrorHandler()

    # Register recovery strategies
    def recover_from_validation_error(error: ValidationError, context: Dict[str, Any]) -> str:
        """Recovery strategy for validation errors."""
        if error.field == "email" and error.rule == "format":
            return f"Attempting to fix email format: {error.value}"
        return "No recovery strategy available"

    error_handler.register_recovery_strategy("ValidationError", recover_from_validation_error)

    # Initialize data validator
    validator = DataValidator()
    # Add validation rules
    validator.add_rule("email", "format", lambda x: '@' in str(x), "Email must contain @ symbol")
    validator.add_rule("age", "range", lambda x: 0 <= int(x) <= 120, "Age must be between 0 and 120")
    validator.add_rule("name", "length", lambda x: len(str(x)) >= 2, "Name must be at least 2 characters")

    # Test data validation
    print("1. Data Validation with Error Handling:")
    test_data = {
        "email": "invalid-email",
        "age": 150,
        "name": "A"
    }
    try:
        validation_errors = validator.validate(test_data)
        if validation_errors:
            print(f" Validation errors found: {len(validation_errors)}")
            for error in validation_errors:
                print(f" - {error}")
                # Handle the error
                error_info = error_handler.handle_error(error, {"data": test_data})
                print(f" Error handled: {error_info['recovery_result']}")
        else:
            print(" All data is valid!")
    except Exception as e:
        print(f" Validation failed: {e}")

    # Test custom exceptions
    print("\n2. Custom Exception Handling:")
    try:
        # Simulate a database error
        raise DatabaseError("SELECT", "users", "Connection timeout", ConnectionError("Connection failed"))
    except DatabaseError as e:
        error_info = error_handler.handle_error(e, {"operation": "user_query"})
        print(f" Database error handled: {error_info['error']}")

    # Test debugging utilities
    print("\n3. Debugging Utilities:")
    # Inspect an object
    sample_object = {
        "name": "John Doe",
        "age": 30,
        "skills": ["Python", "JavaScript", "SQL"]
    }
    inspection = DebugHelper.inspect_object(sample_object)
    print(f" Object inspection: {json.dumps(inspection, indent=2)}")
    # Get call stack
    stack = DebugHelper.get_call_stack()
    print(f" Call stack depth: {len(stack)}")

    # Measure execution time
    def sample_function():
        import time
        time.sleep(0.1)
        return "Function completed"

    timing = DebugHelper.measure_execution_time(sample_function)
    print(f" Function timing: {timing}")

    # Get error statistics
    print("\n4. Error Statistics:")
    stats = error_handler.get_error_statistics()
    print(f" Total errors: {stats['total_errors']}")
    print(f" Error counts: {stats['error_counts']}")
    print(f" Recovery strategies: {stats['recovery_strategies']}")


def demonstrate_error_recovery():
    """Demonstrate error recovery mechanisms."""
    print("\n=== Error Recovery Demonstration ===\n")
    error_handler = ErrorHandler()

    # Register a recovery strategy for network errors
    def retry_network_request(error: NetworkError, context: Dict[str, Any]) -> str:
        """Wait with exponential backoff before the caller's next attempt."""
        import time
        max_retries = context.get('max_retries', 3)
        current_retry = context.get('current_retry', 0)
        if current_retry < max_retries:
            wait_time = 2 ** current_retry
            print(f" Retrying in {wait_time} seconds... (attempt {current_retry + 1}/{max_retries})")
            time.sleep(wait_time)
            return f"Retry attempt {current_retry + 1} completed"
        else:
            return "Max retries exceeded"

    error_handler.register_recovery_strategy("NetworkError", retry_network_request)

    # Simulate network errors with recovery
    print("1. Network Error Recovery:")
    for attempt in range(3):
        try:
            # Simulate network failure
            raise NetworkError("https://api.example.com", "GET", 500, "Internal server error")
        except NetworkError as e:
            context = {"max_retries": 3, "current_retry": attempt}
            error_info = error_handler.handle_error(e, context)
            print(f" Attempt {attempt + 1}: {error_info['recovery_result']}")
            # recovery_result is None if the strategy itself failed, so
            # compare for equality instead of using a substring test
            if error_info['recovery_result'] == "Max retries exceeded":
                break

    print("\n2. Error History Analysis:")
    # Show error history
    stats = error_handler.get_error_statistics()
    print(f" Total errors in session: {stats['total_errors']}")
    print(f" Error types encountered: {list(stats['error_counts'].keys())}")
    # Clear history
    error_handler.clear_history()
    print(" Error history cleared")


if __name__ == "__main__":
    demonstrate_error_handling()
    demonstrate_error_recovery()