Working With Data
Holochain is, at its most basic, a framework for building graph databases on top of content-addressed storage that are validated and stored by networks of peers. Each peer contributes to the state of this database by publishing actions to an event journal called their source chain, which is stored on their device. The source chain can also be used to hold private state.
Entries, actions, and records: primary data
Data in Holochain takes the shape of a record. Different kinds of records have different purposes, but the thing common to all records is the action: one participant’s intent to manipulate their own state or the application’s shared database state in some way. All actions contain:
- The agent ID of the author
- A timestamp
- The type of action
- The hash of the previous action in the author’s history of state changes, called their source chain (note: the first action in a chain omits this field, as there is no previous action to point to)
- The index of the action in the author’s source chain, called the action seq
Some actions also contain a weight, which is a calculation of the cost of storing the action and can be used for rate limiting. (Note: weighting and rate limiting aren’t implemented yet.)
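The common fields listed above can be pictured as a plain struct. This is a minimal sketch using only the standard library, not the real `Action` type from `holochain_integrity_types`; the names `ActionHeader`, `timestamp_micros`, and `has_consistent_position` are hypothetical, the enum only covers the action types discussed on this page, and the sketch assumes that sequence numbers start at 0.

```rust
/// A 32-byte address, as described on this page. (Illustrative alias.)
type Address = [u8; 32];

/// Only the action types discussed on this page; this list is not exhaustive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActionType {
    Create,
    Update,
    Delete,
    CreateLink,
    DeleteLink,
}

/// Hypothetical model of the fields that all actions share.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ActionHeader {
    author: Address,
    timestamp_micros: i64,
    action_type: ActionType,
    /// `None` only for the first action in a source chain.
    prev_action: Option<Address>,
    /// Index of this action in the author's source chain (assumed to start at 0).
    action_seq: u32,
}

impl ActionHeader {
    /// Checks that only the first action (seq 0) omits `prev_action`.
    fn has_consistent_position(&self) -> bool {
        match self.prev_action {
            None => self.action_seq == 0,
            Some(_) => self.action_seq > 0,
        }
    }
}
```

An action that claims to sit at index 3 but has no previous action hash would fail this check, which is the kind of structural rule that makes a source chain tamper-evident.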
The other important part of a record is the entry. Not all action types have an entry to go along with them, but those that do, Create and Update, are called entry creation actions and are the main source of data in an application.
It’s generally most useful to think about a record (entry plus creation action) as the primary unit of data. This is because the action holds useful context about when an entry was written and by whom. One entry written by two actions is considered to be the same piece of content, but when paired with their respective actions into records, each record is guaranteed to be unique.
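To make the distinction between entries and records concrete, here is a toy model in which two agents write the same entry content. It is a sketch under stated assumptions: `toy_hash`, `ToyAction`, `ToyRecord`, and `author_entry` are hypothetical names, and `DefaultHasher` is only a stand-in so the example runs with the standard library; it is not the hashing scheme Holochain uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a content hash (NOT Holochain's real hashing scheme).
fn toy_hash<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical, simplified entry creation action.
#[derive(Hash, Clone, Debug)]
struct ToyAction {
    author: &'static str,
    timestamp: u64,
    /// Address of the entry that this action creates.
    entry_address: u64,
}

/// A record pairs an entry creation action with its entry content.
#[derive(Clone, Debug)]
struct ToyRecord {
    action: ToyAction,
    entry: Vec<u8>,
}

impl ToyRecord {
    /// In this sketch a record is identified by the address of its action.
    fn address(&self) -> u64 {
        toy_hash(&self.action)
    }
}

/// Builds the record that results from `author` writing `entry` at `timestamp`.
fn author_entry(author: &'static str, timestamp: u64, entry: &[u8]) -> ToyRecord {
    let action = ToyAction {
        author,
        timestamp,
        entry_address: toy_hash(entry),
    };
    ToyRecord { action, entry: entry.to_vec() }
}
```

If `author_entry("alice", 1, b"hello")` and `author_entry("bob", 2, b"hello")` are compared, both records carry the same `entry_address`, because the content is identical, while their `address()` values differ, because the author and timestamp are part of the action.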
Entries and actions are both addressable content, which means that they’re retrieved by their address — which is usually the hash of their data. All addresses are 32-byte identifiers.
There are four types of addressable content:
- An entry is an arbitrary blob of bytes, and its address is the hash of that blob. It has an entry type, which your application uses to deserialize it, validate it, and give it meaning.
- An agent ID is the public key of a participant in an application. Its address is the same as its content.
- An action stores action data for a record, and its address is the hash of the serialized action content.
- An external reference is the ID of a resource that exists outside the database, such as the hash of an IPFS resource or the public key of an Ethereum wallet. There’s no content stored at the address; it simply serves as an anchor to attach links to.
Storage locations
Addressable content can either be:
- Private, stored on the author’s device in their source chain, or
- Public, stored in the application’s shared graph database and accessible to all participants.
All actions are public, while entries can be either public or private. External references hold neither public nor private content, but merely point to content outside the database.
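The four kinds of addressable content and their storage locations can be summarized as a small lookup. This is an illustrative sketch, not a Holochain API: the `Addressable`, `Visibility`, and `Storage` types and the `storage_of` function are hypothetical, and treating agent IDs as published is an assumption that this page does not state explicitly.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Visibility {
    Public,
    Private,
}

/// The four kinds of addressable content described on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Addressable {
    Entry { visibility: Visibility },
    AgentId,
    Action,
    ExternalReference,
}

/// Where, if anywhere, the content behind an address can be found.
#[derive(Debug, PartialEq, Eq)]
enum Storage {
    /// Published to the application's shared graph database.
    Published,
    /// Kept only in the author's source chain on their device.
    PrivateToAuthor,
    /// Nothing is stored; the address only serves as an anchor for links.
    NotStored,
}

fn storage_of(item: Addressable) -> Storage {
    match item {
        Addressable::Entry { visibility: Visibility::Public } => Storage::Published,
        Addressable::Entry { visibility: Visibility::Private } => Storage::PrivateToAuthor,
        // All actions are public, even when their entry is private.
        Addressable::Action => Storage::Published,
        // Assumption: agent IDs are public, since their content is their address.
        Addressable::AgentId => Storage::Published,
        Addressable::ExternalReference => Storage::NotStored,
    }
}
```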
Shared graph database
Shared data in a Holochain application is represented as a graph database of nodes connected by edges called links. Any kind of addressable content can be used as a graph node.
This database is stored in a distributed hash table (DHT), which is a key/value store whose data is distributed throughout the network. In a Holochain application, the users themselves become network participants and help keep the DHT alive by storing a portion of the hash table.
Links
A link is a piece of metadata that is attached to one address, the base, and points to another address, the target. Like an entry, it has a type (the link type) that gives it meaning in the application, and it can carry an optional tag that stores arbitrary application data.
When a link’s base and target don’t exist as addressable content in the database, they’re considered external references whose data isn’t accessible to your application’s back end.
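As a rough picture of how links behave as graph edges, the following toy store indexes links by their base address and filters them by type. It is a sketch only: `Link`, `LinkStore`, `create_link`, and `links_from` are hypothetical names, addresses are reduced to `u64`, and link types to `u8`, none of which reflects the real Holochain types.

```rust
use std::collections::HashMap;

/// Toy address; real addresses are hashes or public keys.
type Address = u64;

#[derive(Debug, Clone, PartialEq)]
struct Link {
    base: Address,
    target: Address,
    /// Gives the link its meaning in the application.
    link_type: u8,
    /// Optional arbitrary application data (empty when unused).
    tag: Vec<u8>,
}

/// Links are metadata attached to their base address.
#[derive(Default)]
struct LinkStore {
    by_base: HashMap<Address, Vec<Link>>,
}

impl LinkStore {
    fn create_link(&mut self, link: Link) {
        self.by_base.entry(link.base).or_default().push(link);
    }

    /// Returns every link of `link_type` attached to `base`.
    fn links_from(&self, base: Address, link_type: u8) -> Vec<&Link> {
        self.by_base
            .get(&base)
            .map(|links| links.iter().filter(|l| l.link_type == link_type).collect())
            .unwrap_or_default()
    }
}
```

Note that the store never checks whether the base or target exists as content. That mirrors the point above: an address with no content behind it still works as an anchor for links.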
CRUD metadata graph
Holochain has a built-in create, read, update, and delete (CRUD) model. Data in the graph database and participants’ local state cannot be modified or deleted, so these kinds of mutation are simulated by attaching metadata to existing data that marks changes to its status. This builds up a graph of the history of a given piece of content and its links. We’ll get deeper into this in the next section and in the page on entries.
Individual state histories as public records
All data in an application’s database ultimately comes from the peers who participate in storing and serving it. Each piece of data originates in a participant’s source chain, which is an event journal that contains all the actions they’ve authored. These actions describe intentions to add to either the DHT’s state or their own state.
Every action becomes part of the shared DHT, but not every entry needs to. The entry content of most system-level actions is private. You can also mark an application entry type as private, and its content will stay on the participant’s device and not get published to the graph.
Because every action has a reference to both its author and its previous action in the author’s source chain, each participant’s source chain can be considered a linear graph of their authoring history.
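The linear structure described above can be sketched as a list in which each action stores the address of the one before it. This is a toy model under stated assumptions: `ChainAction`, `address_of`, `append`, and `is_linear` are hypothetical names, `DefaultHasher` stands in for real content hashing, and sequence numbers are assumed to start at 0.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash, Clone, Debug)]
struct ChainAction {
    author: u64,
    seq: u32,
    /// Address of the previous action; `None` for the first action.
    prev: Option<u64>,
    payload: String,
}

/// Stand-in for a content hash (NOT Holochain's real hashing scheme).
fn address_of(action: &ChainAction) -> u64 {
    let mut hasher = DefaultHasher::new();
    action.hash(&mut hasher);
    hasher.finish()
}

/// Appends an action that points back at the current head of the chain.
fn append(chain: &mut Vec<ChainAction>, author: u64, payload: &str) {
    let prev = chain.last().map(address_of);
    let seq = chain.len() as u32;
    chain.push(ChainAction { author, seq, prev, payload: payload.to_string() });
}

/// Checks that the chain is one unbroken line written by a single author.
fn is_linear(chain: &[ChainAction]) -> bool {
    chain.iter().enumerate().all(|(i, action)| {
        let expected_prev = if i == 0 { None } else { Some(address_of(&chain[i - 1])) };
        action.seq as usize == i
            && action.prev == expected_prev
            && action.author == chain[0].author
    })
}
```

Changing the payload of any earlier action changes its address, so the next action's `prev` no longer matches and `is_linear` returns `false`.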
Adding and modifying data
Because data can’t be deleted, mutation is simulated by adding metadata that describes changes to existing data’s state. The current state of an entry or record is calculated using simple rules, but you can also access the underlying metadata and implement your own CRUD model.
Every change starts out as an action on someone’s source chain. This action is turned into DHT operations that get sent to various peers, validated, and integrated into their portions of the database. DHT operations are beyond the scope of this page, so let’s focus on the result of integrating these operations.
Actions as both content and author history
In addition to the changes described below, every action:
- is stored as content at the action’s address.
- is stored as metadata at the author’s agent ID address, which allows peers responsible for that address to collect and validate a summary of the author’s entire history.
Create
- The entry is stored as content at the entry’s address, if it doesn’t already exist.
- The action is stored as metadata at the entry’s address.
Update
- Does everything that Create does.
- In addition, the action is added as metadata to the addresses of the original entry and its entry creation action. These serve as pointers from the original content to their replacements.
Delete
- The action is stored as metadata on the entry and entry creation action that it deletes, indicating their deleted status. The action contains all the information, so there’s no entry content.
CreateLink
- The action is stored as metadata on the link’s base address. The action contains all the link information.
DeleteLink
- The action is stored as metadata at the base address of the link it deletes, indicating that the link has been deleted.
- The action is stored as metadata at the action address of the link it deletes as well.
Private entries are also manipulated via Create, Update, and Delete actions, but only the action gets published to the graph, as the entry content and action content are separate parts of the record.
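The integration rules above boil down to a mapping from each action type to the addresses that receive the action as metadata. The following is a simplified sketch, not how Holochain's DHT operations are implemented: `Action`, `metadata_addresses`, and `integrate` are hypothetical names, addresses are plain strings for readability, and the storage of the action and entry as content is left out.

```rust
use std::collections::HashMap;

/// Toy addresses are plain strings so the example is easy to read.
type Address = &'static str;

/// Hypothetical, stripped-down versions of the five action types.
enum Action {
    Create { author: Address, entry: Address },
    Update { author: Address, entry: Address, original_action: Address, original_entry: Address },
    Delete { author: Address, deletes_action: Address, deletes_entry: Address },
    CreateLink { author: Address, base: Address },
    DeleteLink { author: Address, base: Address, link_action: Address },
}

/// Lists the addresses at which an action is stored as metadata.
/// The author's agent ID always comes first, as every action is attached there.
fn metadata_addresses(action: &Action) -> Vec<Address> {
    match action {
        Action::Create { author, entry } => vec![*author, *entry],
        Action::Update { author, entry, original_action, original_entry } => {
            vec![*author, *entry, *original_entry, *original_action]
        }
        Action::Delete { author, deletes_action, deletes_entry } => {
            vec![*author, *deletes_entry, *deletes_action]
        }
        Action::CreateLink { author, base } => vec![*author, *base],
        Action::DeleteLink { author, base, link_action } => {
            vec![*author, *base, *link_action]
        }
    }
}

/// Attaches `action_address` as metadata to every address the action touches.
fn integrate(
    metadata: &mut HashMap<Address, Vec<Address>>,
    action_address: Address,
    action: &Action,
) {
    for basis in metadata_addresses(action) {
        metadata.entry(basis).or_default().push(action_address);
    }
}
```

After integrating a Create followed by an Update of the same entry, the original entry's address carries both actions as metadata, which is what lets a reader follow the pointer from old content to its replacement.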
Default CRUD rules
The built-in CRUD model is simplistic, collecting all the metadata on an entry or record and producing a final state using these rules:
- Although an entry can have multiple creation actions attached to it as metadata, the record returned contains the oldest-timestamped entry creation action that doesn’t have a corresponding delete action.
- There’s no built-in logic for updates, which means that multiple updates can exist on one entry creation action. This creates a branching update model similar to Git and leaves room for you to create your own conflict resolution mechanisms if you need them. Updates aren’t retrieved by default; you must retrieve them by asking for an address’s metadata.
- A delete applies to an entry creation action, not an entry. An entry is considered live until all its creation actions are deleted, at which point it’s fully dead and isn’t retrieved when asked for. A dead entry is live once again if a new entry creation action authors it.
- Unlike entries, links are completely contained in the action, and are always distinct from each other, even if their base, target, type, and tag are identical. There’s no link update action, and a link deletion action marks one link creation action as dead.
If these rules don’t work for you, you can always directly access the underlying metadata and implement your own CRUD model.
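The rules for entries can be expressed in a few lines. This is a minimal sketch of the logic as described on this page, not the code Holochain runs: `CreationAction`, `live_creation_action`, and `is_entry_live` are hypothetical names, and the delete status is flattened into a boolean rather than derived from delete actions stored as metadata.

```rust
/// Hypothetical summary of one entry creation action attached to an entry.
#[derive(Debug, Clone, PartialEq)]
struct CreationAction {
    address: u64,
    timestamp: u64,
    /// True when a delete action has been attached to this creation action.
    deleted: bool,
}

/// Returns the oldest creation action that has not been deleted, if any.
fn live_creation_action(creations: &[CreationAction]) -> Option<&CreationAction> {
    creations
        .iter()
        .filter(|c| !c.deleted)
        .min_by_key(|c| c.timestamp)
}

/// An entry is live as long as at least one creation action is not deleted.
fn is_entry_live(creations: &[CreationAction]) -> bool {
    live_creation_action(creations).is_some()
}
```

With two creation actions on one entry, deleting the older one makes the newer one the result of a read, deleting both makes the entry dead, and adding a third creation action afterwards brings it back to life.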
Privacy
Each DNA within a Holochain application has its own network and database, isolated from all other DNAs’ networks and their databases. For each participant in a DNA, their source chain is separate from the source chains of all other DNAs they participate in, and from the source chains of all other participants in the same DNA. Within a DNA, all shared data can be accessed by any participant, but a participant’s private entries can only be accessed by that participant.
A DNA can be cloned, creating a separate network, database, and set of source chains for all participants who join it. This lets you use the same backend code to define private spaces within one application to restrict access to certain shared databases.
Summary: multiple interrelated graphs
The shared DHT and the individual source chains are involved in multiple interrelated graphs — the source chain contributes to the DHT’s graph, and the DHT records source chain history. You can use as little or as much of these graphs as your application needs.
Further reading
- Core Concepts: Source Chain
- Core Concepts: DHT
- Core Concepts: Links and Anchors
- Core Concepts: CRUD actions
- Wikipedia: Graph database
- Wikipedia: Distributed hash table
- Martin Fowler on Event Sourcing, the pattern that both source chains and blockchains use to record state changes.
Reference
holochain_integrity_types::record::Record
holochain_integrity_types::action::Action
holochain_integrity_types::prelude::Create
holochain_integrity_types::prelude::Update
holochain_integrity_types::prelude::Delete
holochain_integrity_types::prelude::CreateLink
holochain_integrity_types::prelude::DeleteLink
In this section
- Entries — creating, reading, updating, and deleting