Message Versioning

Evolving message contracts in distributed clusters

Distributed systems evolve. Services gain features, data models change, and deployments happen gradually. During a rolling upgrade, some nodes run new code while others still run the old version. A message sent from a new node must be understood by an old node, and vice versa.

EDF serializes messages by their exact Go type. Change a struct - and you have a new, incompatible type. This is intentional: explicit versioning catches breaking changes at compile time rather than hiding them until production.

This article explains how to version messages so your cluster handles upgrades gracefully.

Explicit Versioning

Unlike Protobuf or Avro, EDF does not provide automatic backward compatibility. There are no optional fields, no field numbers, no schema evolution. A struct is its type. Change the struct - create a new type.

The approach is straightforward: create a new type for each version.

// Version 1
type OrderCreatedV1 struct {
    OrderID int64
}

// Version 2 - new field
type OrderCreatedV2 struct {
    OrderID  int64
    Priority int
}

Both types coexist in the codebase. The receiver handles whichever version arrives:

func (a *Actor) HandleMessage(from gen.PID, message any) error {
    switch m := message.(type) {
    case OrderCreatedV1:
        return a.handleOrderV1(m)
    case OrderCreatedV2:
        return a.handleOrderV2(m)
    }
    return nil
}

All message types must be registered with EDF before connection establishment:

For details on EDF and type registration, see Network Transparency.

Versioning Strategies

There are two ways to organize versioned types: version in the type name or version in the package path. Both work with EDF. Choose based on your team's preferences.

Important: Do not confuse package path versioning with Go modules v2+. Go modules v2+ requires changing both go.mod and all import paths when bumping major version (company.com/events/v2). This forces all consumers to update imports simultaneously, creates diamond dependency problemsarrow-up-right, and generally causes more pain than it solves. Keep your module below v2.0.0 to avoid triggering this mechanism.

Version in Type Name

All versions live in the same package:

Handler uses type names directly:

Advantages:

  • Single import for all versions

  • All versions visible in one place - evolution is clear

  • One registration file for all types

  • Simpler directory structure

Version in Package Path

Each version is a separate package:

Handler uses package aliases:

Advantages:

  • Clean type names without version suffix

  • Familiar to Protobuf users

  • Clear directory separation between versions

  • Removing a version means deleting a directory

Module Organization

For projects where message versions evolve in parallel, place go.mod in each domain directory:

The /v1/ and /v2/ segments are in the middle of the module path, not at the end. Go only applies v2+ import path requirements when /vN is the final path element, so company.com/messaging/v1/events is safe.

This structure allows:

  • V1 to continue receiving new message types while V2 is developed

  • Each domain to have isolated dependencies

  • Clean removal - deleting a directory removes the module entirely

Tagging submodules: Git tags for nested modules must include the path prefix. For module company.com/messaging/v1/events located at v1/events/, use tag v1/events/v0.1.0, not just v0.1.0.

Which to Choose

This documentation uses version in type name for examples. The approach keeps related versions together and requires less import management. However, version in path is equally valid if your team prefers cleaner type names.

Whichever you choose, stay consistent across the codebase.

The versioning mechanism is clear. The next question: where should these types live, and who controls their evolution?

Message Scopes

The answer depends on how the message is used. Not all messages are equal - some travel between two specific services, others broadcast across the entire cluster.

Private Messages

Direct communication between specific services. Request/response patterns between known parties.

spinner

Owner: receiver

Payment Service defines what it accepts. Order Service adapts to Payment's contract.

Cluster-Wide Events

Domain events published to multiple subscribers. Any service can subscribe.

spinner

Owner: shared repository

Events represent domain facts, not service-specific contracts. Ownership belongs to a shared module that all services import.

For event publishing patterns, see Events.

Ownership Rules

Scope determines ownership. Who decides when to create V2? Who approves changes?

Scope
Owner
Module
Changes approved by

Private messages

Receiver

receiver-api/

Receiver team

Cluster-wide events

Shared

events/

All consumers

The receiver owns private contracts because it implements the logic. Multiple senders may use the same contract, but they all adapt to what the receiver accepts. This follows the Consumer-Driven Contractsarrow-up-right pattern. Events are shared because they represent domain facts, not service-specific APIs.

Private Contract Ownership

Payment Service owns its API contract:

Order Service imports and uses it:

Payment team decides when to create V2. Order team adapts.

Cluster Event Ownership

Events require broader coordination:

Breaking changes require sign-off from all consumers.

Repository Organization

With ownership defined, the repository structure follows naturally. Private contracts live with their receivers. Cluster-wide events live in a shared module.

Version in Type Name

Version in Package Path

Registration Helper

All message types must be registered with EDF before connection establishment - during handshake, nodes exchange their registered type lists which become the encoding dictionaries. Registration typically happens in init() functions before node startup. There are two approaches: centralized registration in the shared module or manual registration in each client.

Centralized registration uses init() to register all types when the package is imported:

When clients import the package to use message types, init() runs automatically at program startup and registers all types:

No risk of forgetting a type.

Manual registration means each client registers only the types it uses. This gives more control but introduces risk: a missing registration is only detected at runtime - "no encoder for type" when sending, "unknown reg type for decoding" when receiving. For most projects, centralized registration is simpler and safer. Choose based on your needs.

For message isolation patterns within a single codebase, see Project Structure.

Compatibility Rules

EDF enforces strict type identity. Any struct change breaks wire compatibility.

Change
Compatible
Action

Add field

No

Create new version

Remove field

No

Create new version

Change field type

No

Create new version

Rename field

No

Create new version

Reorder fields

No

Create new version

This differs from Protobuf/Avro where adding optional fields is compatible. In EDF, every change requires explicit versioning.

Yes, this means more work upfront. But consider the alternative: Protobuf lets you add an optional Priority field, and everything "just works" - until you spend three days debugging why orders aren't prioritized correctly. Turns out half your cluster sends the new field, half ignores it, and the receivers silently default missing values to zero. Good luck finding that in logs.

EDF makes this impossible. The receiver either handles OrderV2 with its Priority field, or it doesn't - and you know this at compile time, not at 3 AM when on-call.

Version Lifecycle

With compatibility rules clear, how do versions evolve over time?

When to Create New Version

Any change from the compatibility table above requires a new version. Additionally, create a new version when changing field semantics (same type, different meaning).

Deprecation

Mark deprecated versions:

Log when receiving deprecated versions:

Removal

Remove only when:

  1. All senders upgraded to V2

  2. Monitoring confirms zero V1 traffic

  3. Deprecation period passed

Remove in order:

  1. Stop accepting (return error for V1)

  2. Remove from registration

  3. Delete type definition

Rolling Upgrades

Back to the scenario from the introduction: you're deploying a new version, nodes restart one by one, and for some time the cluster runs mixed code versions. How do you handle this?

Upgrade Strategy

  1. Deploy V2 types to shared module

  2. Update receivers to handle V1 and V2

  3. Rolling restart receiver nodes

  4. Update senders to send V2

  5. Rolling restart sender nodes

  6. Deprecate V1 after all nodes upgraded

  7. Remove V1 after deprecation period

Coexistence Period

spinner

Receivers must support both versions during the upgrade window.

For deployment patterns with weighted routing, see Building a Cluster.

Anti-Corruption Layer

Supporting multiple versions means your handler has multiple code paths. As versions accumulate, this becomes messy. The Anti-Corruption Layer pattern isolates version translation:

Use in handler:

Single implementation handles V2. ACL converts V1 to V2. When V1 is removed, delete the ACL function - no changes to business logic needed.

Contract Testing

With version handling and ACL in place, how do you verify it actually works? Contract testsarrow-up-right verify compatibility:

Test ACL conversion:

Run contract tests in CI before merging changes to shared modules.

For actor testing patterns, see Unit Testing.

Naming Conventions

Consistent naming makes code self-documenting. When you see a type name, you should immediately know: is this async or sync? Is it a request or event? What version?

Async Messages

Prefix with Message, suffix with version:

The prefix signals fire-and-forget semantics. When reading code, MessageXXX means no response is expected. If someone writes Call(pid, MessageOrderShippedV1{}), the mismatch is immediately visible.

Sync Messages

Use Request/Response suffix:

Paired naming makes contracts explicit. ChargeRequest implies ChargeResponse exists. The caller knows to expect a result.

Events

Domain events use past tense without prefix:

Events describe facts that already happened, not requests for action. Past tense (Created, Received) distinguishes them from commands (Create, Charge).

Version Suffix

If using version in type name strategy, always suffix with version number:

If using version in path strategy, the package path carries the version and type names stay clean.

Common Mistakes

These patterns emerge repeatedly in production systems. Avoid them:

Changing existing type instead of creating new version

Forgetting to register new types

Long coexistence periods

Supporting V1 for months creates maintenance burden. Set clear deprecation deadlines and enforce them.

Registering after connection established

Types must be registered before node starts. Dynamic registration requires connection cycling.

Summary

Message versioning in EDF is explicit by design. No hidden compatibility rules, no runtime surprises.

Aspect
Private Messages
Cluster Events

Nature

Service API contract

Domain fact

Owner

Receiver (implements logic)

Shared (belongs to domain)

Module

receiver-api/

events/

Changes

Receiver team decides

All consumers coordinate

Key principles:

  • Version in type name or package path, never in Go module path

  • Receiver owns private contracts

  • Shared repository for domain events

  • Test version compatibility

  • Set deprecation deadlines

  • Use ACL to isolate version translation

Last updated