
Choosing a Networking Style for Realtime Apps

Decision framework for selecting between polling, Server-Sent Events, WebSockets, and CRDTs based on your application's realtime and collaboration requirements.

By Den Odell

Realtime functionality spans a wide spectrum, from dashboards that refresh every thirty seconds to collaborative editors where every keystroke appears instantly on a colleague’s screen. The networking approach you choose shapes the user experience, infrastructure complexity, and operational costs of your application for years to come.

The question is not which technology is best in the abstract. WebSockets are not inherently superior to polling, and CRDTs are not always the right choice for collaboration. The question is which approach fits your specific requirements, constraints, and team capabilities.

Over the years, I have built notification systems that needed sub-second delivery, collaborative tools where multiple users edited the same document simultaneously, and dashboards that displayed metrics from distributed systems. Each project taught me that the right networking choice depends on understanding what “realtime” actually means for your users and being honest about the complexity you can afford to maintain.

Clarify What Realtime Means for Your Application

Before evaluating technologies, I find it helpful to characterize the realtime behavior your application actually needs. Different use cases have fundamentally different requirements.

Periodic freshness means users need reasonably current data, but delays of seconds or even minutes are acceptable. Stock portfolio summaries, social media feeds, and analytics dashboards often fall into this category. Users expect data to update, but they don’t expect instantaneous synchronization.

Live updates means users should see changes shortly after they occur, typically within a second or two. Notification systems, live sports scores, auction bidding interfaces, and order tracking fall here. Users notice delays beyond a few seconds, but brief latency is acceptable.

Realtime collaboration means multiple users interact with the same data simultaneously and expect to see each other’s changes as they happen. Collaborative document editors, shared whiteboards, multiplayer games, and live coding environments require this level of synchronization. Latency of even a few hundred milliseconds can feel sluggish.

Offline-capable collaboration adds another dimension: users need to continue working when disconnected and have their changes merge correctly when connectivity returns. Field service applications, note-taking tools, and mobile-first collaborative apps often need this capability.

Misidentifying your requirements leads to either over-engineering, where you build WebSocket infrastructure for a dashboard that would work fine with polling, or under-engineering, where you build polling for a collaborative editor and users complain about lag and lost work. Getting this classification right early saves considerable pain later.

Understand the Core Options

Polling

Polling is the simplest approach: your application makes HTTP requests at regular intervals to fetch current data. The server responds with the latest state, and your client updates the UI accordingly.

Choose polling when:

  • Data changes infrequently, perhaps every few minutes or hours
  • Latency of several seconds is acceptable to users
  • You want the simplest possible implementation
  • Your infrastructure doesn’t support persistent connections easily

What to watch for: Polling wastes resources when data hasn’t changed. If you poll every five seconds but data only changes once per hour, 99.9% of your requests return identical responses. The cost scales linearly with user count regardless of actual update frequency.

Improving efficiency: Long polling holds connections open until data changes or a timeout occurs, reducing wasted requests. The Stale-While-Revalidate pattern can show cached data immediately while fetching updates in the background, improving perceived responsiveness even with polling.

I recommend polling as the default starting point for any realtime requirement. If polling works well enough, you avoid the complexity of persistent connections entirely. Only escalate to more sophisticated approaches when polling creates measurable problems.
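
As a sketch of that starting point, a minimal poller needs only a timer and a fetch callback. The example below (the function names and intervals are illustrative, not a particular library's API) also backs off when requests fail, so a struggling server isn't hammered:

```typescript
// Minimal polling sketch with exponential backoff on errors.
type Fetcher = () => Promise<unknown>;

// Pure helper: double the delay per consecutive error, capped at maxMs.
export function nextPollDelay(
  baseMs: number,
  consecutiveErrors: number,
  maxMs = 60_000
): number {
  return Math.min(baseMs * 2 ** consecutiveErrors, maxMs);
}

export function startPolling(
  fetchJson: Fetcher,
  onData: (data: unknown) => void,
  baseMs = 5_000
): () => void {
  let errors = 0;
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async () => {
    try {
      onData(await fetchJson());
      errors = 0; // healthy again: return to the base interval
    } catch {
      errors += 1; // degrade gracefully to stale data
    }
    if (!stopped) timer = setTimeout(tick, nextPollDelay(baseMs, errors));
  };

  tick();
  return () => {
    stopped = true;
    clearTimeout(timer);
  };
}
```

The returned function stops the loop, which keeps cleanup trivial when a component unmounts.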

Server-Sent Events

Server-Sent Events (SSE) establish a persistent HTTP connection where the server can push data to the client whenever updates occur. Unlike polling, the server initiates communication when something changes rather than waiting for the client to ask.

Choose SSE when:

  • Updates flow primarily from server to client
  • You need lower latency than polling provides
  • You want simpler infrastructure than WebSockets require
  • Your updates work well as a stream of discrete events

What to watch for: SSE only supports server-to-client communication. If your application needs to send frequent messages from client to server, you’ll need separate HTTP requests for that direction, which may negate SSE’s simplicity benefits.

Infrastructure considerations: SSE works over standard HTTP, which means it flows through existing proxies, load balancers, and CDNs without special configuration. This is a significant operational advantage over WebSockets in many environments. However, some older proxies may buffer SSE responses or impose connection timeouts that require workarounds.

SSE hits a sweet spot for many applications: it provides push-based updates with substantially less complexity than WebSockets. Notification feeds, live dashboards, and progress indicators for long-running operations are excellent fits for SSE.
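
On the wire, SSE is plain text over HTTP: each event is a few `field: value` lines ending with a blank line. A sketch of the server-side framing (the `id`, `event`, and `data` field names are fixed by the SSE format; everything else here is illustrative):

```typescript
// Format one Server-Sent Event per the SSE wire format:
// optional "id:" and "event:" lines, one "data:" line per line of
// payload, terminated by a blank line.
export function formatSseEvent(
  data: string,
  opts: { id?: string; event?: string } = {}
): string {
  const lines: string[] = [];
  if (opts.id) lines.push(`id: ${opts.id}`);
  if (opts.event) lines.push(`event: ${opts.event}`);
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  return lines.join("\n") + "\n\n";
}
```

Including an `id` line is what lets the browser's EventSource resume after a dropped connection, since it echoes the last ID it saw when it reconnects.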

WebSockets

WebSockets establish persistent, bidirectional connections between client and server. Either side can send messages at any time without the overhead of HTTP request/response cycles.

Choose WebSockets when:

  • You need bidirectional communication with low latency
  • Messages flow frequently in both directions
  • Sub-second latency matters to users
  • You’re building interactive features like chat, gaming, or live collaboration

What to watch for: WebSockets require more infrastructure investment. You need WebSocket-capable servers, load balancers configured for persistent connections, and strategies for handling connection lifecycle events. Connections can drop unexpectedly, requiring reconnection logic with backoff strategies.

Scaling considerations: Each WebSocket connection consumes server resources continuously, even when idle. A server that handles thousands of HTTP requests per second might only handle tens of thousands of concurrent WebSocket connections. Horizontal scaling requires sticky sessions or a message broker to route messages to the correct server.

WebSockets are the right choice when you genuinely need bidirectional, low-latency communication. But I have seen teams adopt WebSockets for applications that would work fine with SSE or even polling, only to struggle with operational complexity that wasn’t necessary.
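
The reconnection logic mentioned above is usually exponential backoff with jitter. A sketch, where the socket shape and the scheduling hook are deliberately abstract rather than any particular library's API:

```typescript
// Exponential backoff with "full jitter": pick a random delay up to
// the capped exponential value so reconnecting clients spread out
// instead of stampeding the server after an outage.
export function reconnectDelay(
  attempt: number,
  baseMs = 500,
  maxMs = 30_000,
  random: () => number = Math.random
): number {
  const capped = Math.min(baseMs * 2 ** attempt, maxMs);
  return Math.floor(random() * capped);
}

// Minimal structural type for the parts of a WebSocket this sketch
// uses, so it isn't tied to the browser global.
interface SocketLike {
  onopen: (() => void) | null;
  onmessage: ((e: { data: unknown }) => void) | null;
  onclose: (() => void) | null;
}

// Reconnect on close, resetting the attempt count once a connection
// opens successfully.
export function connectWithRetry(
  makeSocket: () => SocketLike,
  onMessage: (data: unknown) => void,
  schedule: (fn: () => void, ms: number) => void = setTimeout
): void {
  let attempt = 0;
  const open = () => {
    const ws = makeSocket();
    ws.onopen = () => {
      attempt = 0;
    };
    ws.onmessage = (e) => onMessage(e.data);
    ws.onclose = () => {
      attempt += 1;
      schedule(open, reconnectDelay(attempt));
    };
  };
  open();
}
```

In the browser, `makeSocket` would be `() => new WebSocket(url)`; injecting it keeps the retry logic testable.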

CRDTs and Operational Transforms

For applications where multiple users edit the same data simultaneously, you need strategies to handle concurrent modifications. Two approaches dominate this space: Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs).

Operational Transformation represents changes as operations (insert character at position 5, delete characters 10-15) and transforms operations against each other to account for concurrent edits. Google Docs pioneered this approach. OT typically requires a central server to sequence operations, which simplifies conflict resolution but creates a single point of failure and latency dependency.

CRDTs are data structures mathematically designed to merge concurrent changes without conflicts. Each client can apply changes locally and sync with other clients asynchronously, with guaranteed eventual consistency. CRDTs work well for offline-capable applications because changes merge correctly regardless of when or in what order they arrive.

Choose CRDTs or OT when:

  • Multiple users edit the same document or data structure simultaneously
  • Users expect to see each other’s changes in near real-time
  • Merge conflicts must be resolved automatically without user intervention
  • You need character-by-character or fine-grained synchronization

What to watch for: Both approaches add significant complexity. Implementing OT correctly is notoriously difficult; subtle bugs can cause documents to diverge in ways that are hard to detect and repair. CRDTs are easier to reason about mathematically but can produce unintuitive merge results and may grow in size over time as they track history.

For collaborative text editing, I recommend using established libraries like Yjs, Automerge, or ShareDB rather than implementing these algorithms yourself. The edge cases are numerous and the debugging is painful.
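
A production text CRDT is far too large for an example, but the property that makes CRDTs work shows up clearly in the simplest one, a grow-only counter: merge is commutative, associative, and idempotent, so replicas converge no matter the order in which updates arrive. A sketch:

```typescript
// Grow-only counter CRDT: each replica increments only its own slot,
// and merge takes the element-wise maximum. Two replicas that have
// seen the same updates reach the same state, in any order.
type GCounter = Record<string, number>;

export function increment(c: GCounter, replicaId: string): GCounter {
  return { ...c, [replicaId]: (c[replicaId] ?? 0) + 1 };
}

export function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] ?? 0, n);
  }
  return out;
}

export function value(c: GCounter): number {
  return Object.values(c).reduce((sum, n) => sum + n, 0);
}
```

Text CRDTs apply the same idea to far richer structures, which is exactly why the libraries above are worth using instead of rolling your own.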

Choose a Conflict Handling Strategy

When multiple users can edit the same data, conflicts are inevitable. Even with the best networking infrastructure, two users will eventually make changes at the same time. How you handle these conflicts shapes the user experience as much as the networking technology itself.

There are three fundamental strategies for handling conflicts, and understanding when to use each prevents both data loss and unnecessary complexity.

Last-Write-Wins

The simplest conflict resolution strategy is to accept the most recent change and discard earlier concurrent changes. The server timestamps each update, and when conflicts occur, the update with the latest timestamp prevails.

Choose last-write-wins when:

  • The data is low-stakes and easily recreated
  • Changes are infrequent, making conflicts rare
  • Users are unlikely to edit the same record simultaneously
  • You need the simplest possible implementation

What to watch for: Last-write-wins silently discards work. If two users spend ten minutes editing the same document and submit at nearly the same time, one user loses their changes entirely. For user-generated content that took effort to create, this feels like a bug even when it’s working as designed.

Improving the experience: When rejecting a change due to a newer version, return the current content so the client can show what changed. This lets users manually reconcile their work rather than discovering their changes vanished.

Last-write-wins works surprisingly well for many applications. User profile settings, application preferences, and metadata fields rarely see true concurrent edits. The simplicity of the approach outweighs its limitations when conflicts are genuinely rare.

Show Conflicts for User Resolution

When data is valuable enough that silent overwrites are unacceptable, you can detect conflicts and ask users to resolve them manually. This approach preserves all versions and lets humans make decisions about how to merge changes.

When a client submits changes based on an outdated version, the server rejects the update and returns both versions. The client displays a comparison interface where the user can choose which changes to keep, merge them manually, or keep both as separate versions.

Choose conflict dialogs when:

  • The data is valuable and took significant effort to create
  • Users would be upset to lose their changes silently
  • Automatic merging would produce confusing results
  • The content is complex enough that only humans can merge it sensibly

What to watch for: Conflict dialogs interrupt the user’s workflow. If conflicts are frequent, users will become frustrated with constant interruptions. The dialog must be clear about what happened and what the options mean; vague messages like “a conflict occurred” leave users confused about what to do next.

Design considerations: Show a meaningful diff when possible. For text, highlight what changed between versions. For structured data, explain which fields differ. Give users an option to keep both versions if the content is important enough to preserve.

Git’s merge conflict interface is the canonical example of this pattern. It works because conflicts are relatively rare and the content (code) is valuable enough that automatic resolution would be dangerous. If your data has similar characteristics, this approach makes sense.

Local Lock While Editing

Rather than resolving conflicts after they happen, you can prevent them by ensuring only one user can edit a resource at a time. When a user begins editing, they acquire a lock that prevents others from making changes until they’re done.

Before a user enters edit mode, the client requests a lock from the server. If granted, other users see the document as read-only or see a message that someone else is editing. When the editing user saves or abandons their changes, the lock is released.

Choose locking when:

  • Edits take significant time (minutes rather than seconds)
  • The editing experience doesn’t work well with interruptions
  • Users expect exclusive access while working
  • Merge conflicts would be difficult or impossible to resolve

What to watch for: Locks can become stale if users abandon sessions without releasing them. Always implement automatic expiration. Show who holds the lock and when it expires so waiting users know what to expect. Consider “break lock” functionality for administrators when locks become stuck.

Improving the experience: Use heartbeats to extend locks while users are actively editing. This prevents premature expiration while ensuring abandoned locks don’t persist indefinitely. Show real-time lock status to waiting users so they know when they can edit.

Locking feels old-fashioned compared to real-time collaboration, but it remains the right choice for many scenarios. Complex form editing, file uploads with processing steps, and workflows where partial changes would cause problems all benefit from the simplicity of exclusive access.
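
A sketch of such a lock manager with automatic expiration and heartbeat renewal. The clock is injected only to make the logic testable; a real service would also persist locks and expose who holds them:

```typescript
// In-memory lock manager with automatic expiration.
interface Lock {
  holder: string;
  expiresAt: number;
}

export class LockManager {
  private locks = new Map<string, Lock>();

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  // Grant the lock if it is free, expired, or already held by this user.
  acquire(resource: string, user: string): boolean {
    const lock = this.locks.get(resource);
    if (lock && lock.expiresAt > this.now() && lock.holder !== user) {
      return false;
    }
    this.locks.set(resource, {
      holder: user,
      expiresAt: this.now() + this.ttlMs,
    });
    return true;
  }

  // Heartbeat: extend the lock while the holder is still editing.
  renew(resource: string, user: string): boolean {
    return this.acquire(resource, user);
  }

  release(resource: string, user: string): void {
    if (this.locks.get(resource)?.holder === user) {
      this.locks.delete(resource);
    }
  }
}
```

Because expiration is checked on acquire, abandoned locks free themselves without a background sweeper, though a real system might still want one for visibility.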

Choosing Between Conflict Strategies

The right conflict strategy depends on the nature of the data and how users interact with it.

Consider last-write-wins when edits are quick and independent. A user changing their display name doesn’t need to worry about conflicts with someone else changing email notification preferences. Settings, preferences, and metadata where each field is conceptually independent work well with last-write-wins on a per-field basis.

Consider user-resolved conflicts when the data represents significant user investment. If users spend time crafting content, they deserve to see and control what happens when conflicts occur. Long-form text, complex configurations, and anything with business significance benefit from human oversight during conflict resolution.

Consider locking when the editing experience is immersive or time-consuming. If users enter a focused editing mode for extended periods, interrupting them with conflict dialogs would be jarring. Document formatting, visual editors, and multi-step wizards often work better with locks that prevent conflicts entirely.

You can combine strategies within the same application. A project management tool might use last-write-wins for task status changes, conflict dialogs for task descriptions, and locking for the visual project timeline editor. Match the strategy to the characteristics of each type of data.

| Data Characteristics | Recommended Strategy | Example |
| --- | --- | --- |
| Quick edits, low stakes, rarely concurrent | Last-write-wins | User settings, status fields |
| Valuable content, users would notice lost changes | User resolution | Comments, descriptions, documents |
| Immersive editing, extended sessions | Locking | Form builders, visual editors |
| Real-time collaboration expected | CRDTs/OT | Shared documents, whiteboards |

Whatever strategy you choose, communicate clearly to users. If their changes might be discarded, warn them. If someone else is editing, show who. If a conflict occurred, explain what happened and what their options are. The worst conflict experiences happen when users don’t understand why their work disappeared.

Factor in Offline Requirements

The need to work offline fundamentally changes the architecture of realtime applications.

If your application must function when disconnected, you need local storage for pending changes, conflict resolution strategies for when connectivity returns, and UI that clearly communicates sync status to users. This typically points toward CRDTs or similar eventually-consistent approaches, since they’re designed for exactly this scenario.

If offline functionality isn’t required, you have more flexibility. Server-authoritative approaches are simpler because you don’t need to reconcile divergent states. The server always has the canonical truth, and clients simply reflect what the server tells them.

I have seen teams add offline capability as an afterthought and struggle enormously. If offline matters for your application, design for it from the beginning. If it doesn’t matter, don’t add complexity to support a requirement you don’t have.

Consider the Failure Modes

Each networking approach fails differently, and understanding these failure modes helps you build appropriate resilience.

Polling failures are straightforward: requests fail, you retry, and eventually succeed or give up. The user experience degrades gracefully to stale data. This simplicity is an underrated advantage.

SSE failures interrupt the event stream. The browser’s EventSource API reconnects automatically, but you may miss events during the disconnection. You need a strategy to recover missed events, typically by assigning each event an ID; the browser resends the last ID it saw in the Last-Event-ID header on reconnect, and the server uses it to replay what was missed.
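
One way to implement that recovery is a bounded replay buffer keyed by event ID. The sketch below is framework-agnostic, and the capacity handling is illustrative:

```typescript
// Bounded replay buffer for SSE recovery: on reconnect, replay every
// event newer than the ID the client last acknowledged.
interface StoredEvent {
  id: number;
  data: string;
}

export class ReplayBuffer {
  private events: StoredEvent[] = [];
  private nextId = 1;

  constructor(private capacity = 1000) {}

  push(data: string): StoredEvent {
    const event: StoredEvent = { id: this.nextId++, data };
    this.events.push(event);
    if (this.events.length > this.capacity) this.events.shift();
    return event;
  }

  // Events newer than lastEventId, oldest first. If that ID has already
  // fallen out of the buffer, the client gets whatever is still held;
  // a real system might instead signal a full resync.
  since(lastEventId: number): StoredEvent[] {
    return this.events.filter((e) => e.id > lastEventId);
  }
}
```

The capacity bound is the trade-off: a larger buffer tolerates longer disconnections at the cost of memory, and clients that have been away longer than the buffer covers need a full refetch.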

WebSocket failures can be subtle. Connections may appear open while actually being broken, a situation called a “zombie connection.” Heartbeat mechanisms detect this, but implementation varies. When connections drop, you need reconnection logic with exponential backoff to avoid overwhelming servers during outages.

CRDT failures are different in character. Individual operations always succeed locally; the question is whether and when they synchronize with other clients. Partitioned clients may diverge for extended periods. You need to communicate sync status clearly so users understand whether their changes have propagated.

Build for the failure mode that matches your users’ tolerance. For a casual social feature, showing stale data briefly is fine. For a collaborative document where users might lose work, you need robust handling with clear feedback.

Evaluate Your Infrastructure Constraints

Your existing infrastructure may constrain your choices more than you’d like.

Serverless environments like AWS Lambda or Vercel Functions handle HTTP requests naturally but don’t support persistent connections. If you’re committed to serverless, you’ll need managed services for WebSocket support or accept that polling or short-lived SSE connections are your options.

Corporate networks sometimes block WebSocket connections or impose aggressive proxy timeouts on long-lived HTTP connections. If your users are primarily in enterprise environments, test your approach against realistic network conditions.

Mobile networks are unreliable and latency-variable. Persistent connections will drop frequently. Any approach you choose needs graceful reconnection, and you should consider how users perceive the experience when connectivity degrades.

Global distribution complicates realtime architectures. If your users are worldwide but your servers are in one region, latency becomes a factor. Polling is relatively tolerant of latency; interactive collaboration suffers noticeably.

I recommend testing your chosen approach under realistic conditions early in development. Discovering that your corporate users can’t establish WebSocket connections after you’ve built your entire architecture around them is an unpleasant experience.

Match Complexity to Team Capability

The sophistication of your networking approach should match your team’s ability to build, debug, and operate it.

Polling requires skills your team already has: making HTTP requests and managing state. There’s nothing special to learn, nothing new to debug.

SSE is a modest step up. The EventSource API is straightforward, but you need to handle reconnection and missed events. Server-side implementation is simple if you’ve worked with streaming responses before.

WebSockets require understanding connection lifecycle, implementing heartbeats, handling reconnection, and potentially operating message brokers for horizontal scaling. Teams new to WebSockets often underestimate the operational complexity.

CRDTs and OT are at the far end of the complexity spectrum. Even using existing libraries, you need to understand the data structures well enough to debug synchronization issues. Building these from scratch is a multi-month endeavor for experienced teams.

I advise starting simpler than you think you need. You can always add complexity when requirements demand it. Removing complexity is much harder than adding it.

Quick Decision Guide

| Your Situation | Recommended Approach | Reason |
| --- | --- | --- |
| Data changes every few minutes, brief staleness is acceptable | Polling | Simplest implementation, easy to debug |
| Server pushes updates, client rarely sends data | Server-Sent Events | Push-based without WebSocket complexity |
| Frequent bidirectional messages, low latency required | WebSockets | True bidirectional communication |
| Multiple users editing the same content simultaneously | CRDTs or OT (via library) | Handles concurrent edits correctly |
| Must work offline with eventual sync | CRDTs | Designed for offline-first scenarios |
| Corporate/restricted network environments | Polling or SSE | Better proxy and firewall compatibility |

Combining Approaches

Real applications often combine multiple approaches for different features.

A project management tool might use polling for the main dashboard, SSE for notifications, and WebSockets with CRDTs for collaborative document editing within tasks. Each feature gets the networking approach appropriate to its requirements rather than forcing everything through a single solution.

When combining approaches, be deliberate about which data flows through which channel. Accidental duplication, where the same update arrives via both polling and WebSockets, creates subtle bugs. Clear separation of concerns helps: notifications through one channel, document state through another, presence information through a third.

Signals That Your Approach Needs to Change

Watch for specific symptoms that indicate your current approach has been outgrown:

  • Users complain about latency when polling intervals are as short as you can afford. This suggests moving to push-based approaches.
  • Server costs scale painfully with user count even when data changes infrequently. Consider push-based approaches to reduce unnecessary requests.
  • Users lose work due to concurrent edits overwriting each other. This requires proper conflict resolution strategies, potentially CRDTs or OT.
  • Connection management dominates your debugging time. If WebSocket complexity is causing more problems than it solves, consider whether SSE or even polling would suffice.
  • Offline users can’t work effectively. If offline capability has become important, you may need to adopt CRDTs or similar eventually-consistent approaches.

The key is responding to actual problems rather than anticipated ones. Escalate complexity when your current approach creates friction you’re experiencing now.

Decision Checklist

Before committing to an approach, work through these questions:

What latency do users actually need? Distinguish between what users say they want and what actually affects their experience. Test with realistic delays before assuming you need sub-second updates.

Which direction does data flow? If updates primarily flow from server to client, SSE may be simpler than WebSockets. If communication is truly bidirectional, WebSockets become more attractive.

How do you handle concurrent edits? If multiple users can modify the same data, you need a conflict strategy. Optimistic updates with server reconciliation work for many cases; true collaboration needs CRDTs or OT.

What happens when users go offline? If they need to keep working, you need local storage and merge strategies. If not, server-authoritative approaches are simpler.

Can your infrastructure support it? Verify that your hosting, proxies, and network conditions support your chosen approach before building around it.

Can your team operate it? Be honest about your team’s experience with the approach you’re considering. Simpler solutions you can debug are better than sophisticated solutions you can’t.

What’s your migration path? Encapsulate your networking layer behind clean interfaces so you can change approaches later if requirements evolve.
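
That encapsulation can be as small as a subscribe interface that hides the transport entirely (the names below are illustrative):

```typescript
// A transport-agnostic interface: features subscribe to updates
// without knowing whether they arrive via polling, SSE, or
// WebSockets, so the transport can change without touching them.
export interface RealtimeChannel<T> {
  subscribe(onUpdate: (update: T) => void): () => void; // returns unsubscribe
}

// Simplest implementation: an in-process channel. A polling- or
// SSE-backed version would satisfy the same interface.
export class LocalChannel<T> implements RealtimeChannel<T> {
  private listeners = new Set<(update: T) => void>();

  subscribe(onUpdate: (update: T) => void): () => void {
    this.listeners.add(onUpdate);
    return () => this.listeners.delete(onUpdate);
  }

  publish(update: T): void {
    for (const listener of this.listeners) listener(update);
  }
}
```

Swapping polling for SSE later then means writing one new class, not rewriting every feature that consumes updates.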

The best networking approach is the simplest one that meets your actual requirements. Start there, and add complexity only when you have evidence that it’s necessary.
