Understanding JSON → Python
Three good answers, one question.
dict, dataclass, or Pydantic — the right choice depends on whether the data is being read, written, or validated.
Option 1: plain dict, no codegen.
For a script that's run once and discarded, json.loads(s) returns a dict (given a JSON object) and that's the end of it. No type system, no validation, and access goes through brackets rather than attributes. This is Python's strength as much as its weakness: throwaway code can stay throwaway. The cost shows up when the dict outlives the script: you're guessing about keys, the IDE can't autocomplete, and a typo in a key name surfaces only at runtime, as a KeyError with bracket access or as a silent None with .get().
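A minimal sketch of the plain-dict route and its failure modes, using only the standard library. The misspelled key "nmae" is a made-up typo for illustration; note that bracket access and .get() behave differently when the key is wrong.

```python
import json

raw = '{"id": 7, "name": "Q", "tags": ["a"]}'
data = json.loads(raw)

# Bracket access works for keys that exist.
name = data["name"]

# A misspelled key with bracket access fails at runtime with KeyError...
try:
    data["nmae"]
    typo_error = None
except KeyError as exc:
    typo_error = exc

# ...while .get() hides the same typo by returning None.
silent = data.get("nmae")
```

Neither failure is caught before the code runs, which is the trade-off the paragraph above describes.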
Option 2: dataclasses.
Standard library since Python 3.7. @dataclass produces an attrs-style record class with type hints, attribute access, generated __init__, __repr__, and __eq__ methods, and zero runtime dependencies. Building an instance from a dict is your responsibility: the standard library provides no from_dict, so codegen tools either emit a hand-rolled constructor or lean on dacite/cattrs to bridge JSON. Use dataclasses when you want named-attribute access and type hints but don't need runtime validation.
@dataclass: types ✓, validation ✗
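As a rough illustration, here is what a hand-rolled bridge from JSON to a dataclass might look like. The from_dict classmethod is not part of the standard library; it is one possible shape for the constructor a codegen tool could emit, and the class name T is borrowed from the worked example in this article.

```python
import json
from dataclasses import dataclass


@dataclass
class T:
    id: int
    name: str
    tags: list[str]

    @classmethod
    def from_dict(cls, raw: dict) -> "T":
        # Hand-rolled bridge from parsed JSON; nothing here checks types.
        return cls(id=raw["id"], name=raw["name"], tags=list(raw["tags"]))


a = T.from_dict(json.loads('{"id": 7, "name": "Q", "tags": ["a"]}'))
b = T(id=7, name="Q", tags=["a"])

# The generated __eq__ compares instances field by field.
same = a == b

# Type hints are not enforced at runtime: this constructs without error.
loose = T(id="seven", name="Q", tags=["a"])
```

The last line is the "validation ✗" part: the annotation says int, but the string is stored as-is.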
Option 3: Pydantic.
The de-facto choice for data models that need to be validated as they enter the system. A Pydantic BaseModel parses, coerces, and validates JSON in one step; if the input is malformed or has wrong types it raises ValidationError with a structured list of every problem. Pydantic v2 is fast, because its validation core (pydantic-core) is written in Rust, and it powers FastAPI's request validation. Use it whenever the data is crossing a trust boundary (HTTP body, message queue, file upload, environment variables).
BaseModel: types ✓, validation ✓, errors ✓
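The sketch below, which assumes Pydantic v2 is installed, shows a single failed parse reporting every problem at once. The malformed payload is invented for illustration: it has a non-numeric id, no name, and a string where a list is expected.

```python
from pydantic import BaseModel, ValidationError


class T(BaseModel):
    id: int
    name: str
    tags: list[str]


# Well-formed input validates into a typed instance.
ok = T.model_validate({"id": 7, "name": "Q", "tags": ["a"]})

# Malformed input: wrong type for id, missing name, tags is not a list.
try:
    T.model_validate({"id": "seven", "tags": "a"})
    problems = []
except ValidationError as exc:
    # errors() returns one dict per problem, each with a "loc" path.
    problems = exc.errors()

bad_fields = sorted(err["loc"][0] for err in problems)
```

Each entry in problems also carries a "type" and a "msg" key, which is what makes the error list usable for building an API error response.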
A worked transformation.
From { "id": 7, "name": "Q", "tags": ["a"] } the dataclass output is a class T decorated with @dataclass and carrying three annotated fields: id: int, name: str, tags: list[str]. The Pydantic version is structurally identical but inherits from BaseModel instead of using the decorator; you create instances with T.model_validate(payload), where payload is the already-parsed dict (T.model_validate_json(s) takes the raw string), and type errors come back as readable messages. The plain-dict route is just data = json.loads(s): fast, fragile.
dataclass form
standard-library only
Type hints help the IDE; runtime is permissive.
@dataclass class T: id: int; name: str; tags: list[str]
= Lightweight container
Pydantic form
class T(BaseModel)
Same fields; validation happens on every parse.
T.model_validate(json.loads(s))
= Validated instance or ValidationError
Optional fields.
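For dataclasses, name: str | None = None makes the field optional and defaults it to None. For Pydantic, the same syntax works and validation accepts either a string or null. In both cases it is the = None default that makes the field omittable; in Pydantic v2, name: str | None without a default is still required and merely allows null. Optional[str] from typing is the older, equivalent spelling, but the pipe-union form is the modern Python 3.10+ idiom.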
Snake_case is already Python's convention.
PEP 8 prescribes snake_case attribute names, which lines up with JSON APIs that also use snake_case. When the API uses something else, such as camelCase, Pydantic offers Field(alias="...") plus a model-level model_config = ConfigDict(populate_by_name=True) to accept both the alias and the Python attribute name. Dataclasses don't have this built in; you'd hand-roll a from-dict that does the rename.
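To make the alias mechanism concrete, here is a short sketch assuming Pydantic v2. The Account class and the displayName key are made up for illustration; the point is that one field answers to both the JSON name and the Python name.

```python
from pydantic import BaseModel, ConfigDict, Field


class Account(BaseModel):  # hypothetical model for illustration
    model_config = ConfigDict(populate_by_name=True)

    display_name: str = Field(alias="displayName")


# The camelCase key from the JSON payload is accepted via the alias...
from_json = Account.model_validate({"displayName": "Q"})

# ...and populate_by_name lets Python code use the snake_case name.
from_python = Account(display_name="Q")

# by_alias=True restores the original JSON key on output.
round_trip = from_json.model_dump(by_alias=True)
```

Without populate_by_name=True, the second construction would be rejected because only the alias would be recognized as input.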
Dates and decimals.
Pydantic parses ISO 8601 date strings into datetime objects when the field is typed that way, with no extra work. It similarly handles Decimal, UUID, Path, and a growing list of standard-library types. The dataclass equivalent is hand-rolled and tedious. This difference is often the deciding factor for "should I add Pydantic to this project?": the answer is yes if you have any dates or decimals worth a second thought.
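As a sketch of that conversion, assuming Pydantic v2 in its default (non-strict) mode, the model below turns three JSON strings into rich standard-library types. The Invoice class and its sample values are invented for the example.

```python
from datetime import datetime
from decimal import Decimal
from uuid import UUID

from pydantic import BaseModel


class Invoice(BaseModel):  # hypothetical model for illustration
    id: UUID
    issued_at: datetime
    total: Decimal


# Every value arrives as a plain string, as it would from JSON.
invoice = Invoice.model_validate(
    {
        "id": "12345678-1234-5678-1234-567812345678",
        "issued_at": "2024-01-15T09:30:00",
        "total": "19.99",
    }
)
```

A dataclass version would need explicit calls such as datetime.fromisoformat, Decimal(...), and UUID(...) inside a hand-written constructor, one per field.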