Validation
In this section
- Validation (this page)
  - `genesis_self_check` Callback: writing a function to control access to a network
  - `validate` Callback: basic callback, examples using stub functions
  - DHT operations: advanced details on the underlying data structure used in DHT replication and validation
Validation gives shape to your DNA’s data model. It defines the ‘rules of the game’ for a network — who can create, modify, or delete data, and what that data should and shouldn’t look like. It’s also the basis for Holochain’s peer-auditing security model.
You implement your validation logic in your application’s integrity zomes. Validation logic is used in two ways:
- By the author of data, to protect them from publishing invalid data, and
- By an agent that has received data to store and serve, to equip them to detect invalid data and take action against the author.
Because every peer has the DNA’s validation logic on their own machine and is expected to check the data they author before they publish it, publishing invalid data is treated as an intentionally malicious act.
Info
Currently Holochain can inform agents about invalid data when asked. In the future it’ll also take automatic defensive action by putting a malicious author into an agent’s network block list when they see evidence of invalid data.
There are two callbacks that implement validation logic:
- `validate` is the core of the zome’s validation logic. It receives a DHT operation, which is a request to transform the data at an address, and returns a success/failure/indeterminate result.
- `genesis_self_check` ‘pre-validates’ an agent’s own membrane proof before trying to connect to peers in the network.
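The success/failure/indeterminate result can be pictured as a three-variant enum. The sketch below is self-contained and does not use the HDI: `Outcome` is a local stand-in for the real result type, and `check_membrane_proof` with its 64-byte rule is a hypothetical example of the kind of check a `genesis_self_check` callback might perform.

```rust
/// Local stand-in for the three outcomes a validation callback can report.
/// (The real type lives in the HDI; this one exists only for the sketch.)
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The data follows the rules.
    Valid,
    /// The data breaks a rule; the string explains which one.
    Invalid(String),
    /// Indeterminate: something needed for the decision is not available yet.
    UnresolvedDependencies(Vec<String>),
}

/// Hypothetical pre-check of a membrane proof. It looks only at the bytes
/// it is given and never touches the network.
pub fn check_membrane_proof(proof: Option<&[u8]>) -> Outcome {
    match proof {
        None => Outcome::Invalid("membrane proof is required".to_string()),
        // Assumed rule for this sketch: a proof is exactly 64 bytes long.
        Some(bytes) if bytes.len() != 64 => {
            Outcome::Invalid(format!("expected 64 bytes, got {}", bytes.len()))
        }
        Some(_) => Outcome::Valid,
    }
}
```

A self-check like this never needs the indeterminate variant, because it has no peers to fetch dependencies from yet.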
Design considerations
Validation is a broad topic, so we won’t go into detail here. There are a few basic things to keep in mind though:
- The structure of the `Op` type that a `validate` callback receives is complex and deeply nested, and it’s best to let the scaffolding tool generate the callback for you. It generates stub functions that let you think in terms of actions rather than operations, which is more natural and good enough for most needs. Read all about DHT operations if you want deep detail.
- Entry data, link tags, and membrane proofs are just blobs; they need to be parsed in order to check that they have the correct structure. (The HDK makes it easy to deserialize an entry blob into a Rust type though.)
- While entries and links can be thought of as ‘things’, the actions that create, update, or delete them are verbs. Validating a whole action lets you not just check the content and structure of your things, but also enforce write privileges and even throttle an agent’s frequency of writes by looking at the action’s place in their source chain.
- Validation rules must always yield the same true/false outcome for a given operation regardless of who is validating it and when. Don’t use any source of non-determinism, such as instantiating and comparing two `std::time::Instant`s. In fact, Holochain prevents your validation callbacks from calling any non-deterministic host functions. Read more about the available host functions.
- Data may have dependencies that affect validation outcomes, but those dependencies must be addressable, they must be retrievable from the same DHT, and their addresses must be known. If a dependency can’t be retrieved at validation time, the `validate` callback terminates early with an indeterminate result, which will cause Holochain to try again later. (Note that an action already has a dependency on the action preceding it on an agent’s source chain.)
- Even though multiple actions can be written within an atomic transaction, they are not validated together as an atomic transaction. An action can only have dependencies on prior actions in a source chain, not subsequent actions.
- You don’t need to validate your data manually before committing. Holochain validates it after the zome function that writes it is finished, and returns any validation failure to the caller.
- Test, test, test. Validation is the gate that accepts or rejects all DHT data, so make sure you write thorough test coverage for your validation functions. If the data being validated has no dependencies on DHT data or DNA/zome info, we recommend writing Rust unit tests for the validation function stubs that the scaffolding tool generates. We also recommend testing your validation code by writing single- and multiple-agent Tryorama test scenarios for zome functions that write data. This lets you check that your validation rules pass both when authoring data and when checking data authored by other agents. (We’ll write about Tryorama soon; in the meantime, you can check the Tryorama GitHub readme and the scaffolded tests in a project’s `tests/src/` folder for guidance.)
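Several of these points (treating actions as verbs, keeping rules deterministic, and unit-testing validation logic) can be illustrated with one pure function. The following is a minimal sketch, not HDK code: `ActionSummary`, `validate_write_rate`, and the two limit constants are hypothetical names, and the timestamps are assumed to be the ones recorded in the actions themselves rather than readings of the validator’s clock.

```rust
/// Hypothetical, simplified view of an action on a source chain.
#[derive(Debug, Clone)]
pub struct ActionSummary {
    /// Timestamp recorded in the action itself, in microseconds.
    pub timestamp_micros: i64,
}

/// Assumed policy for this sketch: at most 3 writes per 60-second window.
pub const MAX_WRITES_PER_WINDOW: usize = 3;
pub const WINDOW_MICROS: i64 = 60_000_000;

/// Checks a new action against the actions that precede it on the same chain.
/// Only data contained in the actions is consulted, so every validator
/// reaches the same answer no matter when it runs the check.
pub fn validate_write_rate(
    new: &ActionSummary,
    previous: &[ActionSummary],
) -> Result<(), String> {
    let window_start = new.timestamp_micros - WINDOW_MICROS;
    // Count earlier writes that fall inside the window ending at the new action.
    let recent = previous
        .iter()
        .filter(|a| a.timestamp_micros > window_start)
        .count();
    if recent >= MAX_WRITES_PER_WINDOW {
        return Err(format!(
            "rate limit exceeded: {} earlier writes within the window",
            recent
        ));
    }
    Ok(())
}
```

Because the function takes plain values and returns a plain `Result`, it can be exercised in an ordinary Rust unit test without a running conductor.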
Things you don’t need to worry about
- For dependency trees that might get complex and costly to retrieve, you can use inductive validation rather than having to retrieve and validate all the dependencies.
- Action timestamps, sequence indices, and authorship are automatically checked for consistency against the previous action in the author’s source chain.
- Data is checked against Holochain’s maximum size (4 MB for entries, 1 KB for link tags).
- The entry type of `Update` actions is checked against the data they replace.
- The scaffolding tool generates a sensible default `validate` callback that does these things for you:
  - Tries to deserialize an entry into the correct Rust type, and returns a validation failure if it fails.
  - Checks that the original entry for an `Update` or `Delete` action exists and is a valid entry creation action.
  - Checks that the original entry for an `Update` contains the same entry type.
  - Checks that the original entry for a `Delete` comes from the same integrity zome.
  - Checks that the action that registers the agent’s public key is directly preceded by an `AgentValidationPkg` action.
  - Checks that most-recent update links and collection links point to valid entry creation records.
  - Tries to fetch data dependencies from the DHT and make sure they’re the right type.
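The first of those default checks, deserializing a blob and failing validation when that does not work, follows a pattern that can be sketched in plain Rust. Everything here is hypothetical: the `Post` type, the function names, and in particular the toy wire format (one length byte, then the title, then the body). Real entries are serialized by the HDK, not in this format.

```rust
/// Hypothetical entry type for this sketch.
#[derive(Debug, PartialEq)]
pub struct Post {
    pub title: String,
    pub body: String,
}

/// Decodes a `Post` from an invented wire format: one length byte,
/// then that many bytes of UTF-8 title, then the UTF-8 body.
pub fn decode_post(blob: &[u8]) -> Result<Post, String> {
    let (&title_len, rest) = blob.split_first().ok_or("empty blob")?;
    let title_len = title_len as usize;
    if rest.len() < title_len {
        return Err(format!(
            "title needs {} bytes, only {} present",
            title_len,
            rest.len()
        ));
    }
    let (title, body) = rest.split_at(title_len);
    let title = std::str::from_utf8(title).map_err(|e| format!("title is not UTF-8: {e}"))?;
    let body = std::str::from_utf8(body).map_err(|e| format!("body is not UTF-8: {e}"))?;
    Ok(Post {
        title: title.to_string(),
        body: body.to_string(),
    })
}

/// A blob that does not decode is invalid; a decoded value can then be
/// checked against the application's own rules (here: a non-empty title).
pub fn validate_post_blob(blob: &[u8]) -> Result<(), String> {
    let post = decode_post(blob)?;
    if post.title.is_empty() {
        return Err("title must not be empty".to_string());
    }
    Ok(())
}
```

Separating decoding from rule checking keeps structural failures and rule failures distinct, which makes both easier to test.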
Available host functions
As mentioned, any host functions that introduce non-determinism can’t be called from `genesis_self_check` or `validate`; Holochain will return an error. That includes functions that create data or check the current time or agent info, of course, but it also includes certain functions that retrieve DHT or source chain data. That’s because this data can change over time.
These functions are available to both `validate` and `genesis_self_check`:
`validate` can also call these deterministic DHT retrieval functions:
- `must_get_action` tries to get an action from the DHT. (It’s not guaranteed that the action will be valid.)
- `must_get_agent_activity` tries to get a contiguous section of a source chain, starting from a given record and walking backwards to another spot (either the beginning of the chain, a number of records, or one of a number of given hashes).
- `must_get_entry` tries to get an entry from the DHT. (As with `must_get_action`, it’s not guaranteed that the entry will be valid.)
- `must_get_valid_record` tries to get a record, and will fail if the record is marked invalid by any validators, even if it can be found. This makes inductive validation possible.
When the requested data can’t be retrieved, all of these functions cause a `validate` callback to terminate early with `ValidateCallbackResult::UnresolvedDependencies(UnresolvedDependencies)`.
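The idea behind inductive validation, together with the early exit on a missing dependency, can be modelled without the HDK. In this sketch a `HashMap` stands in for the DHT, `Outcome` is a local stand-in for the real result type, and `GameMove`, `StoredRecord`, and `validate_move` are hypothetical names. How a dependency that validators have rejected is reported is also an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Local stand-in for a validation callback result.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Valid,
    Invalid(String),
    UnresolvedDependencies(Vec<String>),
}

/// Hypothetical record type: each move points at the move before it.
#[derive(Debug, Clone)]
pub struct GameMove {
    pub turn: u32,
    pub previous: Option<String>,
}

/// A stored move plus the verdict validators have already reached on it.
pub struct StoredRecord {
    pub game_move: GameMove,
    pub judged_valid: bool,
}

/// Stand-in for the DHT: a map from hash to stored record.
pub type Dht = HashMap<String, StoredRecord>;

/// Validates one move by looking only at its direct predecessor.
/// If the predecessor was itself judged valid, everything before it
/// was checked when it was validated, so there is no need to walk the chain.
pub fn validate_move(new: &GameMove, dht: &Dht) -> Outcome {
    let prev_hash = match &new.previous {
        None if new.turn == 0 => return Outcome::Valid,
        None => return Outcome::Invalid("only turn 0 may omit a previous move".to_string()),
        Some(hash) => hash,
    };
    let prev = match dht.get(prev_hash) {
        // Not retrievable yet: report it and let the caller retry later.
        None => return Outcome::UnresolvedDependencies(vec![prev_hash.clone()]),
        Some(record) => record,
    };
    if !prev.judged_valid {
        return Outcome::Invalid(format!("previous move {} was rejected", prev_hash));
    }
    if new.turn == prev.game_move.turn + 1 {
        Outcome::Valid
    } else {
        Outcome::Invalid(format!(
            "expected turn {}, got {}",
            prev.game_move.turn + 1,
            new.turn
        ))
    }
}
```

The cost of validating a move stays constant however long the game gets, because each step relies on the verdict already reached for the step before it.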