Module 7 / 8 Advanced 18 min read

Aggregates & Repositories

The most conceptually important — and hardest to recover — tactical pattern: a cluster of objects with one entry point, one consistency boundary, and exactly one repository.

In this module

Define the Aggregate and its root, learn why a transaction should touch exactly one Aggregate, and use the one-repository-per-root rule to confirm Aggregate boundaries.

A cluster, with one door in

An Aggregate is a cluster of associated Entities and Value Objects treated as a single unit for the purpose of data changes and consistency. One Entity in the cluster — the Aggregate Root — is the only member that code outside the Aggregate is allowed to hold a reference to. Every access to anything inside the boundary goes through the root. The Aggregate has a consistency boundary: the business rules that must always hold — its invariants — are enforced together, inside one transaction, by the root.

In the bookstore, Order is the Aggregate Root, and OrderLine (the Value Object from Module 6) lives entirely inside its boundary. Nothing outside Sales ever holds an OrderLine directly — code that wants to know an order’s lines asks the Order for them.

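import java.util.ArrayList;
import java.util.List;

final class Order {
    private final OrderId id;
    private final CustomerId customerId;
    private OrderStatus status;
    private final List<OrderLine> lines;

    // Private: the only way to create an Order is place(), which checks the invariant.
    private Order(OrderId id, CustomerId customerId, OrderStatus status, List<OrderLine> lines) {
        this.id = id;
        this.customerId = customerId;
        this.status = status;
        this.lines = lines;
    }

    static Order place(CustomerId customerId, List<OrderLine> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("An order must have at least one line");
        }
        // Mutable private copy, so addLine can append and callers keep no handle on it.
        return new Order(OrderId.generate(), customerId, OrderStatus.PLACED, new ArrayList<>(lines));
    }

    void addLine(OrderLine line) {
        // Invariant: lines may only change while the order is still PLACED.
        if (status != OrderStatus.PLACED) {
            throw new IllegalStateException("Cannot modify an order after it leaves PLACED");
        }
        lines.add(line);
    }

    // Read-only snapshot: outside code asks the root for the lines but cannot mutate them.
    List<OrderLine> lines() {
        return List.copyOf(lines);
    }

    Money total() {
        // Recomputed on every call; safe to unwrap because an Order always has at least one line.
        return lines.stream()
            .map(OrderLine::lineTotal)
            .reduce(Money::add)
            .orElseThrow();
    }
}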

This is the most conceptually important tactical pattern, and the hardest to recover from existing code, precisely because the boundary is almost never written down explicitly anywhere — you have to infer it. Get it right and the whole model’s shape clarifies; get it wrong and you’ll find yourself enforcing the same invariant in three unrelated places, none of which fully trusts the others.

What the root actually does

The root isn’t a formality — it’s where the invariants live. addLine above doesn’t just append to a list; it refuses to do so once the order has left the PLACED state, because allowing lines to be added to a shipped order would violate a rule the business actually cares about. total() always recomputes from the current lines rather than trusting a cached field that could drift out of sync. This is the pattern to look for: methods on the root that maintain a rule spanning multiple members of the cluster, not just getters and setters delegating downward.
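To watch the root's guard at work, here is a minimal, self-contained sketch. The Order below is a cut-down copy of the class above. Money, OrderLine, the SHIPPED status, and the ship() method are stand-ins assumed for illustration, not the definitions from earlier modules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the Value Objects from earlier modules.
record Money(long cents) {
    Money add(Money other) { return new Money(cents + other.cents); }
}

record OrderLine(String isbn, int quantity, Money unitPrice) {
    Money lineTotal() { return new Money(unitPrice.cents() * quantity); }
}

enum OrderStatus { PLACED, SHIPPED }

final class Order {
    private OrderStatus status = OrderStatus.PLACED;
    private final List<OrderLine> lines;

    private Order(List<OrderLine> lines) {
        this.lines = new ArrayList<>(lines); // mutable private copy
    }

    static Order place(List<OrderLine> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("An order must have at least one line");
        }
        return new Order(lines);
    }

    void addLine(OrderLine line) {
        if (status != OrderStatus.PLACED) {
            throw new IllegalStateException("Cannot modify an order after it leaves PLACED");
        }
        lines.add(line);
    }

    // Assumed state transition, only here to move the order out of PLACED.
    void ship() { status = OrderStatus.SHIPPED; }

    Money total() {
        return lines.stream().map(OrderLine::lineTotal).reduce(Money::add).orElseThrow();
    }
}

class OrderInvariantDemo {
    public static void main(String[] args) {
        Order order = Order.place(List.of(new OrderLine("isbn-1", 2, new Money(1500))));
        order.addLine(new OrderLine("isbn-2", 1, new Money(999)));
        System.out.println(order.total().cents()); // 2 * 1500 + 999 = 3999

        order.ship();
        try {
            order.addLine(new OrderLine("isbn-3", 1, new Money(100)));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The caller never touches the list itself, so there is no path that adds a line without passing through the status check.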

Characteristics to look for:

  • A root Entity controlling access to the whole cluster; internal members are reached only through it.
  • The root enforcing invariants that span its members (addLine re-checking order status; total() staying consistent with the current lines).
  • External references point to other Aggregates by id, not by direct object reference. Order holds a CustomerId, not a Customer object, even though a Customer Aggregate might exist in another context entirely. This keeps a transaction scoped to modifying one Aggregate instance at a time, which is what makes the consistency boundary meaningful at the database level. Plenty of real codebases hold cross-Aggregate object references for pragmatic reasons, so treat a direct reference as one signal among several, not proof that no boundary exists.
  • Cascading persistence from root to children (cascade = CascadeType.ALL, orphanRemoval = true in JPA) is a common implementation signal. Its absence proves little on its own, so weigh it together with the other characteristics.

A transaction should modify exactly one Aggregate instance. If your code is about to save changes to two Aggregate Roots in the same transaction, that’s usually a sign the boundary is drawn wrong, or that the second change belongs in its own transaction, triggered by a Domain Event, which Module 8 covers.
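A rough sketch of what this looks like in an application-layer use case follows. Everything here is assumed for illustration: the CancelOrder service, the cancel() behaviour, the reduced Customer class, and the two maps standing in for repositories and a real transaction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

record OrderId(UUID value) {}
record CustomerId(UUID value) {}

// Customer is its own Aggregate Root (hypothetical, reduced to a name).
final class Customer {
    private final CustomerId id;
    private final String name;

    Customer(CustomerId id, String name) { this.id = id; this.name = name; }
    CustomerId id() { return id; }
    String name() { return name; }
}

final class Order {
    private final OrderId id;
    private final CustomerId customerId; // reference by id, not a Customer object
    private boolean cancelled;           // hypothetical state for this sketch

    Order(OrderId id, CustomerId customerId) { this.id = id; this.customerId = customerId; }
    OrderId id() { return id; }
    CustomerId customerId() { return customerId; }
    boolean isCancelled() { return cancelled; }
    void cancel() { cancelled = true; }
}

// Hypothetical use case: it loads, modifies, and saves exactly one Aggregate.
final class CancelOrder {
    private final Map<OrderId, Order> orders;          // stand-in for an Order repository
    private final Map<CustomerId, Customer> customers; // stand-in for a Customer repository

    CancelOrder(Map<OrderId, Order> orders, Map<CustomerId, Customer> customers) {
        this.orders = orders;
        this.customers = customers;
    }

    String handle(OrderId orderId) {
        Order order = orders.get(orderId);
        if (order == null) {
            throw new IllegalArgumentException("Unknown order: " + orderId);
        }
        order.cancel();                // the only Aggregate modified
        orders.put(order.id(), order); // the only Aggregate saved

        // Customer is looked up by id and only read, never changed here.
        Customer customer = customers.get(order.customerId());
        return "Cancelled order for " + customer.name();
    }
}

class CancelOrderDemo {
    public static void main(String[] args) {
        CustomerId customerId = new CustomerId(UUID.randomUUID());
        OrderId orderId = new OrderId(UUID.randomUUID());

        Map<CustomerId, Customer> customers = new HashMap<>();
        customers.put(customerId, new Customer(customerId, "Ada"));
        Map<OrderId, Order> orders = new HashMap<>();
        orders.put(orderId, new Order(orderId, customerId));

        System.out.println(new CancelOrder(orders, customers).handle(orderId));
    }
}
```

Because Order carries only a CustomerId, there is no object graph to drag the Customer into the same save. Reading another Aggregate is fine; writing to it in the same unit of work is the smell.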

Repository: the illusion of an in-memory collection

A Repository provides the illusion of an in-memory collection of all the Aggregates of one type. It mediates between the domain and whatever actually stores the data, exposing collection-like operations — add, save, remove, findById — while hiding every persistence detail behind a domain-meaningful interface.

interface OrderRepository {
    void save(Order order);
    Optional<Order> findById(OrderId id);
    List<Order> findByCustomer(CustomerId customerId);
}
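One way to make the collection illusion concrete is an in-memory implementation, which is also handy in tests. The sketch below is an assumption-laden illustration: InMemoryOrderRepository is a hypothetical name, and Order, OrderId, and CustomerId are reduced to the minimum needed to compile.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

record OrderId(UUID value) {}
record CustomerId(UUID value) {}

// Reduced Order: just enough identity for the repository to work with.
final class Order {
    private final OrderId id;
    private final CustomerId customerId;

    Order(OrderId id, CustomerId customerId) { this.id = id; this.customerId = customerId; }
    OrderId id() { return id; }
    CustomerId customerId() { return customerId; }
}

interface OrderRepository {
    void save(Order order);
    Optional<Order> findById(OrderId id);
    List<Order> findByCustomer(CustomerId customerId);
}

// Hypothetical in-memory implementation: a map keyed by the root's identity.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<OrderId, Order> store = new LinkedHashMap<>();

    @Override
    public void save(Order order) {
        store.put(order.id(), order); // saving the same id again replaces the entry
    }

    @Override
    public Optional<Order> findById(OrderId id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public List<Order> findByCustomer(CustomerId customerId) {
        return store.values().stream()
            .filter(order -> order.customerId().equals(customerId))
            .toList();
    }
}
```

Code that depends only on OrderRepository cannot tell this apart from a database-backed implementation, which is the point of the interface.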

The rule that makes Repositories useful as a diagnostic tool, not just a data-access pattern, is this: there is exactly one Repository per Aggregate Root — never one per Entity. OrderLine does not get an OrderLineRepository. It can’t be loaded, saved, or queried on its own, because it has no existence outside the Order it belongs to.

This cuts both ways, which is what makes it so useful:

  • If you’ve already identified an Aggregate Root, it should have exactly one Repository. No Repository at all is a sign you haven’t wired up persistence yet, or that you misidentified the root.
  • If you find a Repository, the type it returns is almost certainly an Aggregate Root, even if you hadn’t consciously decided that yet. This is one of the strongest signals available for finding Aggregate boundaries in code you didn’t write: look at what has a *Repository interface, especially a Spring Data interface that extends JpaRepository<Root, Id>, and you’ve found your candidate roots.
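The second direction can even be mechanised. The sketch below reads the first type argument off a repository interface using plain Java reflection. The generic Repository<T, ID> base is a hypothetical stand-in for something like Spring Data's JpaRepository, and RootFinder and the empty domain classes are invented for the example; in a real codebase a text search or your IDE's type hierarchy will usually get you there faster.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Optional;

// Hypothetical stand-in for a generic repository base interface.
interface Repository<T, ID> {}

// Empty placeholder types, just so the interfaces below compile.
final class Order {}
final class OrderId {}
final class Customer {}
final class CustomerId {}

interface OrderRepository extends Repository<Order, OrderId> {}
interface CustomerRepository extends Repository<Customer, CustomerId> {}

final class RootFinder {
    // Returns the first type argument of Repository<T, ID>, the candidate Aggregate Root.
    static Optional<Class<?>> rootTypeOf(Class<?> repositoryInterface) {
        for (Type parent : repositoryInterface.getGenericInterfaces()) {
            if (parent instanceof ParameterizedType parameterized
                    && Repository.class.equals(parameterized.getRawType())) {
                Type first = parameterized.getActualTypeArguments()[0];
                if (first instanceof Class<?> rootClass) {
                    return Optional.of(rootClass);
                }
            }
        }
        return Optional.empty(); // not a direct Repository<T, ID> sub-interface
    }
}

class RootFinderDemo {
    public static void main(String[] args) {
        System.out.println(RootFinder.rootTypeOf(OrderRepository.class));
        System.out.println(RootFinder.rootTypeOf(CustomerRepository.class));
    }
}
```

This only inspects directly extended interfaces, so it would miss a repository that reaches the base through an intermediate interface. Treat the result as a list of candidates to confirm by reading the code, not as a verdict.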

Note. Watch for the inverse smell: a Repository for something that looks like it belongs inside another Aggregate — an OrderLineRepository, say. That’s a real tension worth naming rather than ignoring. Either the type genuinely deserves to be its own Aggregate after all (maybe OrderLines really do get queried and modified independently in this codebase), or the boundary has been violated and code somewhere is reaching around Order to touch its internals directly. Both are worth fixing — but they’re different fixes, so don’t guess; trace the call sites.

You can now identify the two halves of every Aggregate — the Entities and Value Objects inside it, and the Repository that’s the only sanctioned way in from outside. The next module covers everything that doesn’t fit neatly inside one Aggregate: operations that span several of them, facts about what already happened, and the disciplined construction of complex objects in the first place.

Next: Module 8 — Domain Services, Events & the Whole Picture.