Intro to Yjs

Written by Basile Samel.

Published Jul 9, 2025. Last updated Jul 9, 2025.

Yjs is a Conflict-Free Replicated Data Type (CRDT) library for JavaScript.

I think it’s one of the most underrated pieces of open-source software I’ve ever worked with, so I decided to write a series of tutorials to help more developers use it productively.

First, let’s start with the basics.

Why Yjs Exists

Yjs was created to address the need for real-time, conflict-free data synchronization across multiple users and devices. Traditional approaches to collaboration, like operational transformation (OT), can be complex and require centralized coordination. CRDTs offer a more distributed and resilient solution.

Yjs brings the power of CRDTs to JavaScript to make it easier to build collaborative / multiplayer apps like Google Docs, Figma, or Notion without having to reinvent the wheel.

It’s also a building block for offline-first web applications, which stay usable regardless of network conditions.

The consequences for web development are far-reaching, but I’ll keep this tirade for another article and dive straight into the meaty part.

What’s A CRDT

CRDT stands for Conflict-Free Replicated Data Type. It’s a type of data structure that automatically resolves conflicts to ensure consistency across replicas, even if updates are made concurrently, arrive out of order (merging is commutative), or are applied multiple times (merging is idempotent).

CRDTs achieve this by defining deterministic merge operations: if two users update a document at the same time, CRDTs guarantee that all replicas will eventually converge to the same state, without needing manual intervention or a central server to resolve conflicts.

Because of these properties, CRDTs are great for building distributed web systems on unreliable networks.
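
To make these properties concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a toy written for this article, not how Yjs works internally, and the names (increment, merge, value) are my own. The point is only to show that a deterministic merge gives the same result whatever the order or number of deliveries.

```javascript
// A toy grow-only counter (G-Counter). Illustrative only: not Yjs's algorithm.
// The state maps each replica id to the number of increments made there.
function increment(state, replicaId) {
  return { ...state, [replicaId]: (state[replicaId] ?? 0) + 1 }
}

// Merging keeps the per-replica maximum, which makes the operation
// commutative, associative, and idempotent.
function merge(a, b) {
  const merged = { ...a }
  for (const [id, count] of Object.entries(b)) {
    merged[id] = Math.max(merged[id] ?? 0, count)
  }
  return merged
}

// The counter's value is the sum of all per-replica counts.
function value(state) {
  return Object.values(state).reduce((sum, n) => sum + n, 0)
}

// Two replicas update concurrently, without talking to each other.
const alice = increment(increment({}, 'alice'), 'alice') // alice counted 2
const bob = increment({}, 'bob') // bob counted 1

const ab = merge(alice, bob)
const ba = merge(bob, alice) // same updates, opposite order
const twice = merge(ab, bob) // bob's update delivered a second time

console.log(value(ab), value(ba), value(twice)) // 3 3 3
```

Yjs's shared types are far more sophisticated than this, but they give you the same guarantee: replicas that have seen the same updates end up in the same state.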

Basic Data Structures

Yjs provides several primary data types:

Y.Doc - The root container holding all shared data and managing updates.
Y.Map - A key-value store for structured data (similar to a JavaScript Map or object).
Y.Array - An ordered collection for working with sequences of items (like a JavaScript array).
Y.Text - A collaborative text type for building editors.

These types can be composed together to build complex collaborative documents:

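import * as Y from 'yjs'

const ydoc = new Y.Doc()

// Nest a shared array inside a shared map
ydoc.getMap('root').set('array', new Y.Array())

// Nested types are edited the same way as top-level ones
ydoc.getMap('root').get('array').push(['ok'])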

All of these data structures are CRDTs so you can apply changes to them independently across clients, and they will merge without conflict.

Import / Export Updates

Yjs operates on binary updates: whenever a user modifies a shared structure, Yjs generates a compact binary update that represents the change. These updates can be serialized and transmitted over any transport mechanism.

Updates can be imported/exported using two simple methods:

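import * as Y from 'yjs'

const doc = new Y.Doc()
const remoteDoc = new Y.Doc()

// export: encode the whole document state as one binary update (Uint8Array)
const update = Y.encodeStateAsUpdate(doc)

// import: merge that update into another document
Y.applyUpdate(remoteDoc, update)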

Networking

There are three main ways to share updates between clients:

Strategy 1: HTTP + Server-Sent Events (SSE)

Use HTTP requests to send / fetch updates to a central server and server-sent events to get notified of updates.

Pros:

Simple to implement and debug.
Can leverage all HTTP features, including cookies for auth.

Cons:

Not bi-directional.
Higher latency.
SSE is a text-only protocol, so you need an additional HTTP request to fetch binary updates (or a text encoding such as Base64).

This is my go-to strategy for 90% of use cases where near real-time is good enough (e.g. not for competitive online games).

Strategy 2: WebSockets

A well-documented solution using good old WebSockets:

import * as Y from 'yjs'

import { WebsocketProvider } from 'y-websocket'

// 1. create a new empty document

const ydoc = new Y.Doc()

// 2. connect document to websocket

const websocketProvider = new WebsocketProvider(
  'wss://demos.yjs.dev', 'count-demo', ydoc
)

// 3. the changes are automatically shared via websocket using the room id

const yarray = ydoc.getArray('count')

yarray.push([1])

Pros:

Full-duplex real-time communication
Low latency
Well-supported

Cons:

Requires a stateful server (centralized architecture)
More complex to scale, and you need to handle firewalls
The browser WebSocket API can’t send custom HTTP headers, so auth feels hacky
You need to implement a reconnection workflow

I’m not fond of this solution because it’s missing many features like auto-reconnection or proper auth, but it’s a good way to scale concurrent edits on a single document. I’m also not a big fan of having a websocket server running 24/7 because cloud computing isn’t cheap and memory leaks are commonplace: all it takes is one bad library to crash everything.
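
For the reconnection workflow, the usual building block when you manage a raw socket yourself is exponential backoff. The sketch below is one way to do it: the base delay, the cap, and the createSocket factory are all assumptions of mine, not part of any Yjs package.

```javascript
// Delay before reconnection attempt number `attempt` (0-based): it doubles
// each time and is capped at `maxMs`. Both defaults are arbitrary choices.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// `createSocket` is a hypothetical factory, e.g. () => new WebSocket(url).
// `schedule` defaults to setTimeout and can be swapped out in tests.
function connectWithRetry(createSocket, onOpen, schedule = setTimeout) {
  let attempt = 0
  const open = () => {
    const socket = createSocket()
    socket.onopen = () => {
      attempt = 0 // a successful connection resets the backoff
      onOpen(socket)
    }
    socket.onclose = () => schedule(open, backoffDelay(attempt++))
  }
  open()
}

console.log([0, 1, 2, 3, 10].map((n) => backoffDelay(n)))
// 500, 1000, 2000, 4000, 30000
```

After reconnecting, the document still has to catch up on what it missed, which is where exchanging state vectors comes in.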

Strategy 3: WebRTC

Pros:

Peer-to-peer = no central server required after initial signaling
Blazingly fast
Modern features
Doesn’t require a strong web server to scale

Cons:

You need a signaling server
Not ideal for large groups, since each participant needs a direct connection to every other participant.

I like to use WebRTC to send awareness / presence data like cursor positions or form changes. It’s a good way to decrease server costs if the room is small (20-35 participants). A full-mesh topology gets complex fast, though, and you’ll need to engineer something to scale beyond that, so you might as well use WebSockets.
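
To see why a full mesh gets out of hand, it helps to count connections. Every pair of peers shares one connection, so a room of n participants needs n * (n - 1) / 2 of them in total, and each browser maintains n - 1. A quick back-of-the-envelope script (the function name is mine):

```javascript
// Total peer connections in a full mesh of `n` participants:
// one connection per pair of peers.
function meshConnections(n) {
  return (n * (n - 1)) / 2
}

for (const n of [2, 5, 20, 35]) {
  console.log(`${n} peers: ${meshConnections(n)} connections, ${n - 1} per peer`)
}
// 2 peers: 1 connections, 1 per peer
// 5 peers: 10 connections, 4 per peer
// 20 peers: 190 connections, 19 per peer
// 35 peers: 595 connections, 34 per peer
```

The total grows quadratically, which is why the small-room range above is roughly where a mesh stops being comfortable.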

Data Storage Layer

Yjs uses a binary format to store, transfer, and persist document updates efficiently. This binary format is compact and optimized for performance: Yjs documents are typically much smaller than equivalent JSON.
According to open-source benchmarks, Yjs achieves good performance even for huge documents with millions of characters.
Yjs is designed for structured data, not blobs, so it’s not suitable for storing files.
Persistence is handled using any database that supports binary blobs. The official documentation features examples including LevelDB, IndexedDB, or Postgres:

import * as Y from 'yjs'

import { IndexeddbPersistence } from 'y-indexeddb'

// 1. create new document

const ydoc = new Y.Doc()

// 2. Auto-sync yjs document with indexeddb document

const idb = new IndexeddbPersistence('count-demo', ydoc)

idb.whenSynced.then(() => {
    console.log('loaded data from indexed db')
})

// 3. Make changes

const yarray = ydoc.getArray('count')
yarray.push([1])

// 4. Test the changes are persisted

const ydoc2 = new Y.Doc()

const idb2 = new IndexeddbPersistence('count-demo', ydoc2)

idb2.whenSynced.then(() => {
    console.log(ydoc2.getArray('count').toJSON()) // [1]
})