Node.js Data Flow

Streams & Buffers

Process massive amounts of data piece by piece. Memory-efficient, time-efficient, and incredibly powerful.

📄
Source
Large File
↓
64KB
64KB
64KB
↓
🎯
Destination
Processed

Why Streams Matter

Traditional methods load everything into memory. Streams process data as it flows, enabling you to work with files of any size.

🧠

Memory Efficient

Process multi-GB files with constant memory usage. No more "JavaScript heap out of memory" errors.

⚡

Time Efficient

Start processing immediately. Don't wait for the entire file to load before beginning work.

🔗

Composable

Chain streams together with pipes. Read → Compress → Encrypt → Write in one flowing pipeline.

📦

Handle Large Data

Videos, logs, databases, network responses. If it's big, streams handle it gracefully.

Four Types of Streams

Node.js provides four fundamental stream types, each designed for specific data flow patterns. Understanding these is key to mastering data processing.

💡 Water Pipe Analogy

Think of streams like plumbing. Water (data) flows continuously through pipes. You don't need to fill a bucket first; you process water as it flows, and you can connect multiple pipes together.

TYPE 01 📖

Readable

Data sources you can read from. The starting point of your data flow.

const fs = require('fs');

const readStream = fs.createReadStream('file.txt');

readStream.on('data', (chunk) => {
  console.log(chunk);
});
• fs.createReadStream()
• http.IncomingMessage
• process.stdin
TYPE 02 ✍️

Writable

Destinations you can write to. The endpoint where data lands.

const writeStream = fs.createWriteStream('out.txt');

writeStream.write('Hello');
writeStream.end('World');
• fs.createWriteStream()
• http.ServerResponse
• process.stdout
TYPE 03 🔄

Duplex

Both readable and writable. Data flows both ways simultaneously.

// TCP sockets are duplex
const net = require('net');
const socket = net.createConnection({ port: 8080 });

socket.write('request');       // writable
socket.on('data', (chunk) => { // readable
  console.log(chunk);
});
• TCP sockets
• zlib streams
• crypto streams
TYPE 04 ⚡

Transform

Modify data as it passes through. The powerhouses of stream processing.

const { Transform } = require('stream');

const upperCase = new Transform({
  transform(chunk, enc, cb) {
    this.push(chunk.toString().toUpperCase());
    cb();
  }
});
• zlib.createGzip()
• crypto.createCipheriv()
• Custom processors

The Power of Piping

Connect streams together to create powerful data processing pipelines. Data flows from source to destination, transformed along the way.

📖
Read
⚡
Transform
✍️
Write

Basic Pipe

// Simple file copy
fs.createReadStream('input.txt')
  .pipe(fs.createWriteStream('output.txt'));

Compression Pipeline

const zlib = require('zlib');

// Read → Compress → Write
fs.createReadStream('file.txt')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('file.txt.gz'));

Multi-Transform Chain

const { Transform } = require('stream');

// Custom transforms
const upperCase = new Transform({
  transform(chunk, enc, cb) {
    this.push(chunk.toString().toUpperCase());
    cb();
  }
});

const removeSpaces = new Transform({
  transform(chunk, enc, cb) {
    this.push(chunk.toString().replace(/\s/g, ''));
    cb();
  }
});

// Chain them together
fs.createReadStream('data.txt')
  .pipe(upperCase)
  .pipe(removeSpaces)
  .pipe(fs.createWriteStream('result.txt'));

Pro Tip

.pipe() handles backpressure automatically! It pauses the readable when the writable is overwhelmed, preventing memory issues.

Stream Events

📖 Readable Events

'data'

Chunk of data is available

Most Common
'end'

No more data to read

'error'

Error occurred

Always Handle
'readable'

Data ready to be read

'close'

Stream closed

readStream.on('data', (chunk) => {
  console.log(`Received ${chunk.length} bytes`);
});

readStream.on('end', () => {
  console.log('Stream ended');
});

readStream.on('error', (err) => {
  console.error('Error:', err);
});

✍️ Writable Events

'drain'

Safe to write more data

Backpressure
'finish'

All data has been flushed

'error'

Error occurred

'pipe'

Readable stream piped to this

'unpipe'

Stream unpiped

writeStream.on('finish', () => {
  console.log('All data written');
});

writeStream.on('drain', () => {
  console.log('Buffer cleared, resume writing');
});

With vs Without Streams

❌

Without Streams

Loads entire file into memory. Crashes with large files!

fs.readFile('large-video.mp4', (err, data) => {
  if (err) throw err;
  
  // 💥 Memory explosion!
  // 2GB file = 2GB RAM used
  
  fs.writeFile('copy.mp4', data, (err) => {
    if (err) throw err;
    console.log('Copied!');
  });
});
High memory usage
Slow start (wait for full load)
Crashes on large files
✅

With Streams

Processes in small chunks. Constant memory usage regardless of file size.

const readStream = fs.createReadStream('large-video.mp4');
const writeStream = fs.createWriteStream('copy.mp4');

// ✨ Constant memory (~64KB buffer)
readStream.pipe(writeStream);

writeStream.on('finish', () => {
  console.log('Copied!');
});
Low memory footprint
Immediate processing
Handles any file size

Understanding Backpressure

When the readable side produces data faster than the writable side can consume it, unwritten chunks pile up in memory until the process crashes. Here's how to handle it.

The Problem: Manual Writing

readStream.on('data', (chunk) => {
  // write() returns false when buffer is full
  const canContinue = writeStream.write(chunk);
  
  if (!canContinue) {
    console.log('Backpressure! Pausing...');
    readStream.pause(); // Stop reading
  }
});

writeStream.on('drain', () => {
  console.log('Drained! Resuming...');
  readStream.resume(); // Safe to continue
});
⚠️

Memory Danger Zone

Without proper backpressure handling, your application will accumulate data in memory until it crashes with "JavaScript heap out of memory".

The Solution: Use .pipe()

// ✨ .pipe() handles backpressure automatically!
// It pauses when needed, resumes when ready

fs.createReadStream('huge-file.txt')
  .pipe(fs.createWriteStream('output.txt'));

// That's it. No manual pause/resume needed.
// The stream manages flow control for you.
🛑

Auto Pause

Pauses reading when write buffer is full

▶️

Auto Resume

Resumes when drain event fires

🛡️

Protected

Memory stays constant regardless of speed difference

Quick Reference

createReadStream() Start reading
createWriteStream() Start writing
.pipe() Connect streams
.on('data') Handle chunks
highWaterMark Buffer size (64KB default for fs streams, 16KB for most others)
.pause() Stop reading
.resume() Continue reading
.on('error') Handle errors