Pydantic is a data validation and settings/config management library for Python. It makes sense to use Pydantic when we have data schemas in our codebase (called "models" in Pydantic terms), such as POJO-like classes in Python. If we only want to check the types in our function signatures, we can just use mypy, which checks types statically and does not validate data at runtime.
python -m pip install "pydantic[email]" - Installs Pydantic together with its optional email-validation dependency.
python -m pip install pydantic-settings - Installs the sibling library that is used for config management.
What's the difference between Pydantic and Python dataclasses?
Pydantic’s primary way of defining data schemas is through models. A Pydantic model is an object, similar to a Python dataclass, that defines and stores data about an entity with annotated fields (annotated fields are the ones with type hints such as name: str). Unlike dataclasses, Pydantic’s focus is centered around automatic data parsing, validation, and serialization.
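As a rough illustration of that difference, the sketch below defines the same two fields as a dataclass and as a Pydantic model. The PointDC and PointModel names are made up for this example and are not part of the original notes.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class PointDC:  # hypothetical dataclass, for comparison only
    x: int
    y: int


class PointModel(BaseModel):  # hypothetical Pydantic model with the same fields
    x: int
    y: int


# A dataclass stores whatever it is given; the annotations are not enforced.
dc = PointDC(x="1", y="not a number")

# A Pydantic model coerces compatible input (the string "1" becomes the int 1) ...
model = PointModel(x="1", y=2)

# ... and rejects input it cannot convert.
try:
    PointModel(x="1", y="not a number")
    rejected = False
except ValidationError:
    rejected = True
```

The dataclass keeps the string as-is, while the model either converts the value or refuses to build the object.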
See this repo for demo code and setup for using pydantic + mypy + pre-commit. Also this.
How to use Pydantic’s BaseModel to validate and serialize your data
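Below is an example of defining a Pydantic model:
from datetime import date
from uuid import UUID, uuid4

from pydantic import BaseModel


class Employee(BaseModel):
    # Caution: uuid4() runs once, when the class is defined, so every instance
    # shares this default. Field(default_factory=uuid4) gives a fresh ID each time.
    employee_id: UUID = uuid4()
    name: str
    date_of_birth: date
    salary: float
    elected_benefits: bool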
Pydantic validates the fields when an Employee object is instantiated. If the values you pass in are valid, or can be coerced to the annotated types, Pydantic creates a valid Employee object; otherwise it raises a ValidationError.
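A minimal sketch of both outcomes, using a cut-down copy of the Employee model (employee_id is left out to keep it short). The input values are invented for illustration.

```python
from datetime import date

from pydantic import BaseModel, ValidationError


class Employee(BaseModel):
    name: str
    date_of_birth: date
    salary: float
    elected_benefits: bool


# Compatible input is coerced to the annotated types.
chris = Employee(
    name="Chris DeTuma",
    date_of_birth="1998-04-02",  # ISO date string -> datetime.date
    salary="123000",             # numeric string -> float
    elected_benefits=True,
)

# Input that cannot be converted raises ValidationError, which reports
# every failing field rather than stopping at the first one.
bad_fields = []
try:
    Employee(
        name="Eric",
        date_of_birth="not a date",
        salary="high",
        elected_benefits=True,
    )
except ValidationError as exc:
    bad_fields = sorted(err["loc"][0] for err in exc.errors())
```

Here bad_fields ends up holding the names of the two fields that failed validation, date_of_birth and salary.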
Something to note here is that employee_id, name, and the other fields are all instance variables, not class variables, even though the syntax makes them look like class variables.
If you wanted to define a true class-level variable (i.e., something not meant to be part of the model’s data), you’d do this:
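from typing import ClassVar

from pydantic import BaseModel


class MyModel(BaseModel):
    name: str  # regular model field, stored per instance
    description: ClassVar[str] = "This is a model"  # class-level, not a field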
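One way to check this, as a small sketch: the ClassVar attribute does not show up among the model's fields or in its serialized output, but it is still readable on the class.

```python
from typing import ClassVar

from pydantic import BaseModel


class MyModel(BaseModel):
    name: str
    description: ClassVar[str] = "This is a model"


m = MyModel(name="example")

field_names = list(MyModel.model_fields)  # only the annotated instance fields
dumped = m.model_dump()                   # the ClassVar is not serialized
class_level = MyModel.description         # still available on the class itself
```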
You can also instantiate an Employee object from a dictionary, as shown below.
new_employee_dict = {
    "name": "Chris DeTuma",
    "date_of_birth": "1998-04-02",
    "salary": 123_000.00,
    "elected_benefits": True,
}
another_chris = Employee.model_validate(new_employee_dict)
You can do the same thing with JSON strings using .model_validate_json() - see this
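A short sketch of the JSON route, again with a cut-down Employee model and an invented payload:

```python
from datetime import date

from pydantic import BaseModel


class Employee(BaseModel):
    name: str
    date_of_birth: date
    salary: float
    elected_benefits: bool


new_employee_json = """
{
    "name": "Chris DeTuma",
    "date_of_birth": "1998-04-02",
    "salary": 123000.00,
    "elected_benefits": true
}
"""

# Parse and validate the JSON string in a single step.
employee = Employee.model_validate_json(new_employee_json)
```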
This is one of the reasons why FastAPI relies on Pydantic to create REST APIs.
You can also serialize Pydantic models as dictionaries and JSON - see this
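For instance, assuming the same cut-down Employee model, .model_dump() returns a dictionary and .model_dump_json() returns a JSON string:

```python
from datetime import date

from pydantic import BaseModel


class Employee(BaseModel):
    name: str
    date_of_birth: date
    salary: float
    elected_benefits: bool


employee = Employee(
    name="Chris DeTuma",
    date_of_birth=date(1998, 4, 2),
    salary=123_000.00,
    elected_benefits=True,
)

as_dict = employee.model_dump()       # Python dict; date stays a datetime.date
as_json = employee.model_dump_json()  # JSON string; date becomes an ISO string

# The JSON output can be fed straight back into the model.
round_tripped = Employee.model_validate_json(as_json)
```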
Create a JSON schema from your Employee model - see this
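A minimal sketch using .model_json_schema(), which returns the schema as a plain dictionary (the model is the same cut-down Employee used for illustration):

```python
from datetime import date

from pydantic import BaseModel


class Employee(BaseModel):
    name: str
    date_of_birth: date
    salary: float
    elected_benefits: bool


schema = Employee.model_json_schema()

# Fields without defaults are listed as required.
required_fields = set(schema["required"])

# Each field gets an entry under "properties" describing its JSON type.
dob_schema = schema["properties"]["date_of_birth"]
```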
More customized data validation
The Field class allows you to customize and add metadata to your model’s fields.
from datetime import date
from uuid import UUID, uuid4

from pydantic import BaseModel, EmailStr, Field


class Employee(BaseModel):
    employee_id: UUID = Field(default_factory=uuid4, frozen=True)
    name: str = Field(min_length=1, frozen=True)
    email: EmailStr = Field(pattern=r".+@example\.com$")
    date_of_birth: date = Field(alias="birth_date", repr=False, frozen=True)
    salary: float = Field(alias="compensation", gt=0, repr=False)
    elected_benefits: bool
Read this to learn more about what frozen, alias, and the other Field arguments mean.
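As a rough sketch of what some of these arguments do, the example below keeps only two fields from the model above (the email field is dropped so the snippet does not need the optional email dependency). The values are made up.

```python
from pydantic import BaseModel, Field, ValidationError


class Employee(BaseModel):
    name: str = Field(min_length=1, frozen=True)
    salary: float = Field(alias="compensation", gt=0, repr=False)


# alias: the input key is the alias, not the attribute name.
emp = Employee(name="Chris", compensation=123_000)

# The value is still read through the field name.
salary = emp.salary

# repr=False: the field is hidden from the string representation.
text = repr(emp)

# frozen=True: reassignment after creation is rejected.
try:
    emp.name = "Someone Else"
    frozen_enforced = False
except ValidationError:
    frozen_enforced = True

# gt=0: the constraint is checked during validation.
try:
    Employee(name="Chris", compensation=-1)
    gt_enforced = False
except ValidationError:
    gt_enforced = True
```

With these settings the model is populated through compensation but read through salary, which is useful when the incoming data uses different key names than the code.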