🎨 Advanced Python OOP & Idioms

These features help keep your data engineering code clean, reusable, and type-safe.


🛠️ 1. Decorators

Decorators wrap a function with additional logic without changing its body. They are commonly used for logging, retry logic, and timing metrics.

from functools import wraps
import time

def timeit(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} took {end - start:.4f}s")
        return result
    return wrapper

@timeit
def process_data(data):
    # Process complex data here
    pass
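
Retry logic, mentioned above, follows the same pattern with one extra layer so the decorator can take arguments. The following is a minimal sketch; the names `retry`, `max_attempts`, `delay`, and `flaky_fetch` are hypothetical and chosen only for illustration.

```python
import time
from functools import wraps

def retry(max_attempts=3, delay=0.0, exceptions=(Exception,)):
    """Hypothetical decorator factory: re-run func when it raises one of `exceptions`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Re-raise once the final attempt has failed.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator

calls = {"count": 0}

@retry(max_attempts=3, delay=0.0, exceptions=(ConnectionError,))
def flaky_fetch():
    # Simulated source that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

result = flaky_fetch()
```

Because `wrapper` is decorated with `@wraps(func)`, `flaky_fetch.__name__` still reports the original function name, which keeps log messages and tracebacks readable.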

📦 2. Context Managers (with)

Context managers ensure resources (like database connections or files) are properly closed, even if an error occurs.

from contextlib import contextmanager

@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        cursor.close()
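
To see the commit and rollback paths in action, the sketch below runs the same context manager against an in-memory SQLite database. The `events` table and the simulated `RuntimeError` are assumptions made for the example; any DB-API connection exposing `cursor()`, `commit()`, and `rollback()` should behave similarly.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor
        connection.commit()    # reached only if the with-block finished cleanly
    except Exception:
        connection.rollback()  # undo partial work, then propagate the error
        raise
    finally:
        cursor.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Successful block: the insert is committed.
with db_transaction(conn) as cur:
    cur.execute("INSERT INTO events (name) VALUES (?)", ("load_started",))

# Failing block: the insert is rolled back and the exception is re-raised.
try:
    with db_transaction(conn) as cur:
        cur.execute("INSERT INTO events (name) VALUES (?)", ("load_finished",))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

rows = conn.execute("SELECT name FROM events ORDER BY id").fetchall()
```

Only the row from the successful block remains in `rows`; the second insert was discarded by the rollback.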

🏷️ 3. Modern Type Hinting (Pydantic)

Python is dynamically typed, and type hints on their own are not enforced at runtime. Pydantic uses those hints to validate and coerce data when a model is instantiated, and it is widely used for data validation.

from pydantic import BaseModel, Field

class UserProfile(BaseModel):
    id: int
    name: str = Field(min_length=2)
    email: str
    is_active: bool = True

# Validation runs automatically on instantiation.
# model_dump_json() is the Pydantic v2 API (v1 used .json()).
user = UserProfile(id=1, name="Alice", email="alice@example.com")
print(user.model_dump_json())
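
What happens when the input is invalid is worth seeing as well. The sketch below assumes Pydantic is installed and reuses the `UserProfile` model; the invalid values are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError

class UserProfile(BaseModel):
    id: int
    name: str = Field(min_length=2)
    email: str
    is_active: bool = True

# In the default (lax) mode, compatible input is coerced: "42" becomes the int 42.
coerced = UserProfile(id="42", name="Bob", email="bob@example.com")

# Invalid input raises ValidationError, which reports every failing field at once.
try:
    UserProfile(id="not-a-number", name="A", email="a@example.com")
except ValidationError as exc:
    failed_fields = sorted(err["loc"][0] for err in exc.errors())
```

Here `failed_fields` collects both `id` (not an integer) and `name` (shorter than two characters), so a single pass surfaces all problems in a record rather than just the first one.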

🏗️ 4. Protocols (Static Duck Typing)

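Protocols let you define an interface without requiring explicit inheritance. Any class with matching method signatures satisfies the protocol, and static type checkers such as mypy verify this structurally.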

from typing import Protocol

class DataProcessor(Protocol):
    def process(self, data: dict) -> dict:
        ...

def execute_pipeline(processor: DataProcessor, data: dict) -> dict:
    return processor.process(data)
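
As a brief illustration, the class below satisfies `DataProcessor` without inheriting from it. `UppercaseKeys` is a hypothetical processor invented for this sketch, and the `@runtime_checkable` decorator is an optional addition that enables `isinstance` checks (these only confirm the method exists, not its signature).

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class DataProcessor(Protocol):
    def process(self, data: dict) -> dict:
        ...

class UppercaseKeys:
    """Hypothetical processor: no base class, just a matching process() method."""
    def process(self, data: dict) -> dict:
        return {key.upper(): value for key, value in data.items()}

def execute_pipeline(processor: DataProcessor, data: dict) -> dict:
    return processor.process(data)

result = execute_pipeline(UppercaseKeys(), {"id": 1, "name": "Alice"})
is_processor = isinstance(UppercaseKeys(), DataProcessor)
```

A static type checker accepts `UppercaseKeys()` as a `DataProcessor` argument because its structure matches, which makes it easy to swap in test doubles or third-party classes you cannot modify.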