Working With Data

Holochain is, at its most basic, a framework for building graph databases on top of content-addressed storage, where the data is validated and stored by networks of peers. Each peer contributes to the state of this database by publishing actions to an event journal stored on their device, called their source chain. The source chain can also be used to hold private state.

Entries, actions, and records: primary data

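Data in Holochain takes the shape of a record. Different kinds of records have different purposes, but the thing common to all records is the action: one participant’s intent to manipulate their own state or the application’s shared database state in some way. All actions contain, among other fields, a reference to their author, the time they were written, and a reference to the previous action (if there is one) in the author’s source chain.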

Some actions also contain a weight, which is a calculation of the cost of storing the action and can be used for rate limiting. (Note: weighting and rate limiting aren’t implemented yet.)

The other important part of a record is the entry. Not all action types have an entry to go along with them, but those that do, Create and Update, are called entry creation actions and are the main source of data in an application.

It’s generally most useful to think about a record (entry plus creation action) as the primary unit of data. This is because the action holds useful context about when an entry was written and by whom. If two actions write identical entry data, that data is considered to be one piece of content, but each action paired with the entry forms its own record, and each record is guaranteed to be unique.

[Figure: Alice authors Action 1, Bob authors Action 2, and Carol authors Action 3. All three actions create the same Entry.]

Entries and actions are both addressable content, which means that they’re retrieved by their address — which is usually the hash of their data. All addresses are 32-byte identifiers.
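The relationship between entries, actions, and addresses can be sketched in a few lines of Python. This is a conceptual model only, not Holochain's API: the JSON serialization, the field names, and the `address_of` and `create_action` helpers are all assumptions made for illustration, and BLAKE2b with a 32-byte digest is used simply because it is available in the standard library.

```python
import hashlib
import json


def address_of(content: dict) -> bytes:
    """Return a 32-byte address: a BLAKE2b hash of the content's canonical JSON form.

    The serialization format is an assumption made for this sketch.
    """
    canonical = json.dumps(content, sort_keys=True).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=32).digest()


def create_action(author: str, timestamp: int, entry_hash: bytes) -> dict:
    """Build a hypothetical Create action that points at an entry by its address."""
    return {
        "type": "Create",
        "author": author,
        "timestamp": timestamp,
        "entry_hash": entry_hash.hex(),
    }


# One piece of entry data, written independently by two authors.
entry = {"title": "Sounds of Silence"}
entry_address = address_of(entry)

alice_action = create_action("alice", 1_700_000_000, entry_address)
bob_action = create_action("bob", 1_700_000_050, entry_address)

# Identical entry data hashes to the same address, so both actions point at one entry.
same_entry = alice_action["entry_hash"] == bob_action["entry_hash"]

# Each action has its own address, so the two records (action plus entry) stay distinct.
distinct_records = address_of(alice_action) != address_of(bob_action)
```

Because the author and timestamp are part of the action's data, two participants can never accidentally produce the same record, even when the entry content is byte-for-byte identical.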

There are four types of addressable content: entries, actions, agent IDs, and external references.

Storage locations

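Addressable content can be either private, stored only on the author’s device in their source chain, or public, stored in the application’s shared graph database and accessible to all participants.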

All actions are public, while entries can be either public or private. External references hold neither public nor private content, but merely point to content outside the database.

Shared graph database

Shared data in a Holochain application is represented as a graph database of nodes connected by edges called links. Any kind of addressable content can be used as a graph node.

This database is stored in a distributed hash table (DHT), which is a key/value store whose data is distributed throughout the network. In a Holochain application, the users themselves become network participants and help keep the DHT alive by storing a portion of the hash table.

A link is a piece of metadata that is attached to one address, the base, and points to another address, the target. Like an entry, it has a type (the link type) that gives it meaning in the application. It can also carry an optional tag that stores arbitrary application data.
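As a rough illustration of this structure, the sketch below models a link as a small record and keeps an in-memory index keyed by base address. The `Link` and `LinkStore` names, their methods, and the readable string addresses are hypothetical stand-ins chosen for this example; they are not Holochain's types or functions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Link:
    base: str                    # address the link is attached to
    target: str                  # address the link points to
    link_type: str               # application-defined meaning
    tag: Optional[bytes] = None  # optional arbitrary application data


class LinkStore:
    """Toy in-memory index of links, keyed by base address."""

    def __init__(self) -> None:
        self._by_base: Dict[str, List[Link]] = {}

    def create_link(self, link: Link) -> None:
        self._by_base.setdefault(link.base, []).append(link)

    def get_links(self, base: str, link_type: str) -> List[Link]:
        """Return the links on a base address that have the given type."""
        return [link for link in self._by_base.get(base, []) if link.link_type == link_type]


store = LinkStore()
artist = "simon_and_garfunkel"  # placeholder for a real 32-byte address

store.create_link(Link(artist, "bridge_over_troubled_water", "artist_album"))
store.create_link(Link(artist, "sounds_of_silence", "artist_album"))
store.create_link(
    Link(artist, "bridge_over_troubled_water", "artist_album_by_release_date", b"1970-01-26")
)
store.create_link(
    Link(artist, "sounds_of_silence", "artist_album_by_release_date", b"1966-01-17")
)

albums = store.get_links(artist, "artist_album")

# The tag carries the release date, so the application can sort without fetching targets.
by_date = sorted(
    store.get_links(artist, "artist_album_by_release_date"),
    key=lambda link: link.tag,
)
```

Using two link types on the same base is a design choice: one type answers "which albums does this artist have?", while the other puts sortable data in the tag so that ordering can be done from the links alone.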

[Figure: Simon & Garfunkel is linked to Sounds of Silence by a link of type artist_album and a link of type artist_album_by_release_date with tag 1966-01-17, and to Bridge over Troubled Water by a link of type artist_album and a link of type artist_album_by_release_date with tag 1970-01-26.]

When a link’s base and target don’t exist as addressable content in the database, they’re considered external references whose data isn’t accessible to your application’s back end.

[Figure: external reference hC8kafe9…7c12 is linked to external reference hC8kd01f…84ce by a link of type eth_wallet_to_ipfs_profile_photo.]

CRUD metadata graph

Holochain has a built-in create, read, update, and delete (CRUD) model. Data in the graph database and participants’ local state cannot be modified or deleted, so these kinds of mutation are simulated by attaching metadata to existing data that marks changes to its status. This builds up a graph of the history of a given piece of content and its links. We’ll get deeper into this in the next section and in the page on entries.

Individual state histories as public records

All data in an application’s database ultimately comes from the peers who participate in storing and serving it. Each piece of data originates in a participant’s source chain, which is an event journal that contains all the actions they’ve authored. These actions describe intentions to add to either the DHT’s state or their own state.

Every action becomes part of the shared DHT, but not every entry needs to. The entry content of most system-level actions is private. You can also mark an application entry type as private, and its content will stay on the participant’s device and not get published to the graph.

Because every action has a reference to both its author and its previous action in the author’s source chain, each participant’s source chain can be considered a linear graph of their authoring history.
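The linear structure described here is essentially a hash chain. The following sketch shows the idea under stated assumptions: the `seq` field, the JSON serialization, and the `append_action` and `chain_is_valid` helpers are invented for illustration and do not reflect Holochain's actual data format or validation logic.

```python
import hashlib
import json
from typing import List


def action_hash(action: dict) -> str:
    """Hash an action's canonical JSON form (serialization is an assumption)."""
    canonical = json.dumps(action, sort_keys=True).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()


def append_action(chain: List[dict], author: str, action_type: str) -> dict:
    """Append an action that references its author and the previous action's hash."""
    prev = action_hash(chain[-1]) if chain else None
    action = {
        "author": author,
        "type": action_type,
        "seq": len(chain),          # hypothetical position field
        "prev_action_hash": prev,   # None only for the first action
    }
    chain.append(action)
    return action


def chain_is_valid(chain: List[dict]) -> bool:
    """Check that every action points at its predecessor and shares one author."""
    for index, action in enumerate(chain):
        expected_prev = action_hash(chain[index - 1]) if index > 0 else None
        if action["prev_action_hash"] != expected_prev:
            return False
        if action["author"] != chain[0]["author"]:
            return False
    return True


chain: List[dict] = []
for action_type in ["DNA", "AgentValidationPkg", "Create", "Create", "InitZomesComplete"]:
    append_action(chain, "alice", action_type)

# Changing any earlier action breaks the reference held by the action after it.
tampered = [dict(action) for action in chain]
tampered[2]["type"] = "Delete"
```

Calling `chain_is_valid(chain)` succeeds for the untouched chain, while `chain_is_valid(tampered)` fails because the fourth action still holds the hash of the original third action.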

[Figure: a source chain in which each action points to the one before it via prev_action_hash. In order: DNA, AgentValidationPkg, Create (entry_hash points to AgentID), Create (entry_hash points to Entry 1), InitZomesComplete, Create (entry_hash points to Entry 2), Delete, Update (entry_hash points to Entry 3), CreateLink, DeleteLink.]

Adding and modifying data

Because data can’t be deleted, mutation is simulated by adding metadata that describes changes to existing data’s state. The current state of an entry or record is calculated using simple rules, but you can also access the underlying metadata and implement your own CRUD model.
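To make the idea of simulated mutation concrete, here is a minimal sketch of an append-only store in which updates and deletes are recorded as metadata on the original record. The liveness rule used here (a record is live unless at least one delete has been attached to it) is a simplifying assumption for this example and is not necessarily the built-in rule; the `MetadataStore` class and the string addresses are likewise hypothetical.

```python
from typing import Dict, List


class MetadataStore:
    """Append-only store: content is never removed, only annotated."""

    def __init__(self) -> None:
        self.records: Dict[str, str] = {}        # record address -> content
        self.updates: Dict[str, List[str]] = {}  # record address -> newer record addresses
        self.deletes: Dict[str, List[str]] = {}  # record address -> delete action addresses

    def create(self, address: str, content: str) -> None:
        self.records[address] = content

    def update(self, original: str, new_address: str, new_content: str) -> None:
        """Store the new record and attach a pointer to it on the original."""
        self.create(new_address, new_content)
        self.updates.setdefault(original, []).append(new_address)

    def delete(self, address: str, delete_action_address: str) -> None:
        """Attach a delete marker; the original content stays where it is."""
        self.deletes.setdefault(address, []).append(delete_action_address)

    def is_live(self, address: str) -> bool:
        """Assumed rule: live means stored and carrying no delete markers."""
        return address in self.records and not self.deletes.get(address)


store = MetadataStore()
store.create("post_v1", "Hello wrold")
store.update("post_v1", "post_v2", "Hello world")
store.delete("post_v1", "delete_action_1")
```

After these calls, both versions are still present in `store.records`, `store.updates["post_v1"]` leads to the corrected version, and only `post_v2` is reported as live. An application that wants different semantics can read the same metadata and apply its own rule.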

Every change starts out as an action on someone’s source chain. This action is turned into DHT operations that get sent to various peers, validated, and integrated into their portions of the database. DHT operations are beyond the scope of this page, so let’s focus on the result of integrating these operations.

Actions as both content and author history

In addition to the changes described below, every action:

  • is stored as content at the action’s address.
  • is stored as metadata at the author’s agent ID address, which allows the peers responsible for that address to collect and validate a summary of the author’s entire history.

Private entries are also manipulated via Create, Update, and Delete actions, but only the action gets published to the graph, as the entry content and action content are separate parts of the record.
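The split between action and entry can be sketched as follows. This is a toy model under stated assumptions: `SharedStore`, `publish`, and the placeholder string addresses are invented for illustration, and the real publishing process (which works through DHT operations) is not modeled.

```python
from typing import Dict, List


class SharedStore:
    """Toy stand-in for the shared graph database."""

    def __init__(self) -> None:
        self.content: Dict[str, dict] = {}              # address -> stored content
        self.agent_activity: Dict[str, List[str]] = {}  # agent ID -> action addresses


def publish(
    shared: SharedStore,
    action_address: str,
    action: dict,
    entry: dict,
    entry_is_private: bool,
) -> None:
    """Share an action, and share its entry only when the entry is public."""
    # The action is stored as content at its own address...
    shared.content[action_address] = action
    # ...and as metadata at the author's agent ID address.
    shared.agent_activity.setdefault(action["author"], []).append(action_address)
    # Private entry content never leaves the author's device.
    if not entry_is_private:
        shared.content[action["entry_hash"]] = entry


shared = SharedStore()

public_action = {"type": "Create", "author": "alice", "entry_hash": "entry_public"}
private_action = {"type": "Create", "author": "alice", "entry_hash": "entry_private"}

publish(shared, "action_1", public_action, {"text": "visible to all"}, entry_is_private=False)
publish(shared, "action_2", private_action, {"text": "only on my device"}, entry_is_private=True)
```

Other participants can see that `action_2` exists and which entry address it refers to, but looking up `entry_private` in `shared.content` finds nothing.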

Default CRUD rules

The built-in CRUD model is simplistic, collecting all the metadata on an entry or record and producing a final state using these rules:

If these rules don’t work for you, you can always directly access the underlying metadata and implement your own CRUD model.

Privacy

Each DNA within a Holochain application has its own network and database, isolated from all other DNAs’ networks and databases. Each participant’s source chain in a DNA is separate from their source chains in every other DNA they participate in, and from the source chains of all other participants in the same DNA. Within a DNA, all shared data can be accessed by any participant, but a participant’s private entries can be accessed only by that participant.

A DNA can be cloned, creating a separate network, database, and set of source chains for all participants who join it. This lets you use the same backend code to define private spaces within one application to restrict access to certain shared databases.
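One way to picture cloning is to treat a network's identity as a hash of the DNA's code plus a distinguishing value. The sketch below assumes exactly that: the `dna_identity` function, the `network_seed` parameter name, and the byte layout are illustrative assumptions, not a description of how Holochain computes its identifiers.

```python
import hashlib
from typing import Dict, Set


def dna_identity(code: bytes, network_seed: str) -> str:
    """Derive a hypothetical network identity from backend code plus a seed."""
    hasher = hashlib.blake2b(digest_size=32)
    # Length-prefix the code so the boundary between code and seed is unambiguous.
    hasher.update(len(code).to_bytes(8, "big"))
    hasher.update(code)
    hasher.update(network_seed.encode("utf-8"))
    return hasher.hexdigest()


backend_code = b"compiled chat backend"  # placeholder for real compiled code

# Same code, different seeds: two isolated spaces.
public_lobby = dna_identity(backend_code, "lobby")
private_team = dna_identity(backend_code, "team-7f3a")

# Membership is tracked per identity, so joining one space grants nothing in the other.
members: Dict[str, Set[str]] = {public_lobby: set(), private_team: set()}
members[public_lobby].update({"alice", "bob", "carol"})
members[private_team].update({"alice", "bob"})
```

Under this assumption, only participants who know the seed can compute the identity of the private space, which is what makes the clone usable as a restricted shared database.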

Summary: multiple interrelated graphs

The shared DHT and the individual source chains are involved in multiple interrelated graphs — the source chain contributes to the DHT’s graph, and the DHT records source chain history. You can use as little or as much of these graphs as your application needs.
