Most file conversion tools upload your files to a remote server, process them, and send them back. That means your data leaves your device — which can be a problem when you’re working with sensitive documents.
I wanted to explore a more privacy‑friendly approach: do as much as possible directly in the browser, and only fall back to server processing when the web platform can’t do the job well. This “Hybrid (Browser‑First)” model is what I used to build FastlyConvert — a multi‑format conversion & compression suite.
- Website: https://www.fastlyconvert.com
- Privacy policy: https://www.fastlyconvert.com/privacy
The Architecture: What Runs Where
Not every conversion can happen client‑side. Browsers are great at image operations, but heavier workloads (e.g., large video transcoding, speech recognition) still require server compute for good UX and reliability.
Here’s how I split the work today:
| Conversion Type | Where it Runs | Technology |
|---|---|---|
| Image format (JPG ↔ PNG ↔ WebP) | 100% browser | Canvas API + toBlob() |
| Image resize / compress | 100% browser | Canvas API + (OffscreenCanvas where available) |
| HEIC → JPG/PNG | 100% browser | WebAssembly (e.g., heic2any) |
| Video compression / conversion | Server‑side | FFmpeg |
| Audio format conversion | Server‑side | FFmpeg |
| Audio/Video → Text | Server‑side | Whisper (speech‑to‑text) |
| Text → Speech | Server‑side | OpenAI TTS (MP3 output) |
Rule of thumb: if the Web API can handle it natively, keep it in the browser. Everything else goes to the server with strict privacy controls (e.g., temporary storage + auto‑deletion).
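One way to make that rule concrete is a small capability check on the client, falling back to an upload when the needed Web APIs are missing. This is a sketch of the idea rather than FastlyConvert's actual routing code, and the task names are invented for illustration:

// Decide where a task should run. Task names here are illustrative.
function canRunInBrowser(task) {
  switch (task) {
    case "image-convert":
    case "image-resize":
      // Canvas encoding is all we need for basic image work.
      return typeof HTMLCanvasElement !== "undefined" &&
        typeof HTMLCanvasElement.prototype.toBlob === "function";
    case "heic-decode":
      // The HEIC decoder ships as WebAssembly.
      return typeof WebAssembly !== "undefined";
    default:
      // Video, audio and AI workloads go to the server.
      return false;
  }
}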
Client‑Side Image Conversion with Canvas API
The simplest conversion — say JPG → PNG — needs surprisingly little code:
async function convertImage(file, targetFormat) {
  // Canvas encoders expect "jpeg", but users often type "jpg".
  const mimeType = `image/${targetFormat === "jpg" ? "jpeg" : targetFormat}`;
  const url = URL.createObjectURL(file);
  const img = new Image();

  return new Promise((resolve, reject) => {
    img.onload = () => {
      const canvas = document.createElement("canvas");
      canvas.width = img.naturalWidth;
      canvas.height = img.naturalHeight;
      const ctx = canvas.getContext("2d");
      ctx.drawImage(img, 0, 0);
      canvas.toBlob(
        (blob) => {
          URL.revokeObjectURL(url);
          blob ? resolve(blob) : reject(new Error("Encoding failed"));
        },
        mimeType,
        0.92 // quality: only honored for lossy formats (JPEG/WebP)
      );
    };
    img.onerror = () => {
      URL.revokeObjectURL(url);
      reject(new Error("Could not decode image"));
    };
    img.src = url;
  });
}
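Wiring that into the page is just as small. A hypothetical usage sketch (the #file-input element id is made up for this example):

// Convert the selected file to PNG and trigger a download.
document.querySelector("#file-input").addEventListener("change", async (e) => {
  const blob = await convertImage(e.target.files[0], "png");
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = "converted.png";
  link.click();
  // Revoke a moment later so the download has a chance to start.
  setTimeout(() => URL.revokeObjectURL(link.href), 1000);
});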
Key gotcha: the quality parameter only applies to lossy formats like JPEG and WebP. PNG is always lossless, so the quality argument is simply ignored for PNG output.
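Resizing and compression, which the table above also keeps fully client-side, follow the same pattern. Here is a rough sketch using createImageBitmap plus OffscreenCanvas when available, with a regular canvas fallback; the exact split is my reading of the "where available" note, not FastlyConvert's production code:

async function resizeImage(file, maxWidth, quality = 0.8) {
  const bitmap = await createImageBitmap(file);
  const scale = Math.min(1, maxWidth / bitmap.width);
  const width = Math.round(bitmap.width * scale);
  const height = Math.round(bitmap.height * scale);

  if (typeof OffscreenCanvas !== "undefined") {
    // OffscreenCanvas can encode without touching the DOM.
    const canvas = new OffscreenCanvas(width, height);
    canvas.getContext("2d").drawImage(bitmap, 0, 0, width, height);
    return canvas.convertToBlob({ type: "image/jpeg", quality });
  }

  // Fallback: a regular canvas element + toBlob.
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  canvas.getContext("2d").drawImage(bitmap, 0, 0, width, height);
  return new Promise((resolve) => canvas.toBlob(resolve, "image/jpeg", quality));
}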
Handling HEIC (iPhone Photos) in the Browser
HEIC has been the default photo format on iPhones for years, but most browsers can’t decode HEIC natively. For this, a WebAssembly approach works well.
I used heic2any (WASM‑based):
https://github.com/alexcorvi/heic2any
import heic2any from "heic2any";
async function convertHeic(file) {
const blob = await heic2any({
blob: file,
toType: "image/jpeg",
quality: 0.92,
});
return blob;
}
This runs entirely in the browser — no server upload needed — which is exactly the kind of task where browser‑first shines.
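One way to tie the two image paths together is to sniff HEIC up front and pick a decoder. A small sketch reusing the helpers above (the MIME/extension checks are an assumption about how HEIC files show up in practice):

// Some browsers report an empty MIME type for HEIC, so check the extension too.
function isHeic(file) {
  return (
    file.type === "image/heic" ||
    file.type === "image/heif" ||
    /\.(heic|heif)$/i.test(file.name)
  );
}

async function toJpeg(file) {
  return isHeic(file) ? convertHeic(file) : convertImage(file, "jpeg");
}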
Server‑Side: Video/Audio Processing with FFmpeg
For large video compression and audio transcoding, the browser can do it in theory (WASM FFmpeg exists), but in practice it’s often:
- too slow on low‑end devices,
- too memory‑heavy for big files,
- and hard to make reliable across browsers.
So I run FFmpeg server‑side for video/audio tasks, and focus on:
- clear presets (e.g., quality vs size),
- predictable outputs (e.g., MP4 H.264 as the default),
- and privacy policies (auto‑deletion).
Example UX pattern that worked well: give users 3–4 “compression modes” (High Quality / Balanced / Max Compress) instead of asking them to tune bitrate and CRF on day one.
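As a sketch of what those modes can map to under the hood, here is a Node-flavored wrapper around the ffmpeg CLI. The CRF and preset values are illustrative assumptions, not FastlyConvert's production settings, and it assumes ffmpeg is available on the server's PATH:

const { spawn } = require("node:child_process");

// Each user-facing mode maps to a fixed H.264 recipe.
const PRESETS = {
  "high-quality": ["-c:v", "libx264", "-crf", "20", "-preset", "slow"],
  "balanced": ["-c:v", "libx264", "-crf", "23", "-preset", "medium"],
  "max-compress": ["-c:v", "libx264", "-crf", "28", "-preset", "fast"],
};

function compressVideo(inputPath, outputPath, mode = "balanced") {
  return new Promise((resolve, reject) => {
    const args = ["-i", inputPath, ...PRESETS[mode], "-c:a", "aac", "-y", outputPath];
    const ff = spawn("ffmpeg", args);
    ff.on("error", reject);
    ff.on("close", (code) =>
      code === 0 ? resolve(outputPath) : reject(new Error(`ffmpeg exited with code ${code}`))
    );
  });
}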
Server‑Side: AI Transcription with Whisper (Audio/Video → Text)
For speech recognition, browser‑only options still don’t match Whisper’s quality and language coverage at scale.
The key architectural decisions I made:
- Process only what the user requests (no extra analysis).
- Auto‑delete uploaded files after a short retention window.
- Keep the API simple, and return clean text + language metadata.
Pseudo‑code sketch:
import whisper
from fastapi import FastAPI, UploadFile

app = FastAPI()
whisper_model = whisper.load_model("base")  # model size is illustrative

# save_temp, schedule_delete and detect_language are placeholders for
# whatever temp-file and cleanup strategy the backend uses.

@app.post("/api/transcribe")
async def transcribe(file: UploadFile):
    temp_path = save_temp(file)
    # Schedule deletion so uploads never outlive the retention window.
    schedule_delete(temp_path, hours=24)
    result = whisper_model.transcribe(
        temp_path,
        task="transcribe",
        language=detect_language(temp_path),
    )
    return {"text": result["text"], "language": result["language"]}
i18n: Supporting 7 Languages Without a Framework
FastlyConvert is plain HTML + vanilla JS (no React/Next). For i18n, I used a simple attribute‑based approach:
document.querySelectorAll("[data-i18n]").forEach((el) => {
const key = el.getAttribute("data-i18n");
el.textContent = translations[currentLang][key] || el.textContent;
});
This supports:
- English (en)
- French (fr)
- Japanese (ja)
- Spanish (es)
- Portuguese (pt)
- Simplified Chinese (zh-CN)
- Traditional Chinese (zh-TW)
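The translation data itself is just a nested object keyed by language code, and switching languages re-runs the same loop. The dictionary shape and setLanguage helper below are assumptions for illustration; only the [data-i18n] lookup comes from the snippet above:

// Illustrative structure; real keys and strings will differ.
const translations = {
  en: { "hero.title": "Convert files in your browser" },
  fr: { "hero.title": "Convertissez vos fichiers dans le navigateur" },
  // ...ja, es, pt, "zh-CN", "zh-TW"
};

let currentLang = localStorage.getItem("lang") || "en";

function setLanguage(lang) {
  if (!translations[lang]) return;
  currentLang = lang;
  localStorage.setItem("lang", lang);
  document.querySelectorAll("[data-i18n]").forEach((el) => {
    const key = el.getAttribute("data-i18n");
    el.textContent = translations[lang][key] || el.textContent;
  });
}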
No build step. No runtime framework. Just static pages + lightweight scripts.
Performance: Why No Build Step
Yes — it’s “old school”, but it’s extremely effective for SEO‑driven tool pages:
- Users land directly on specific tools (e.g., /video-compressor, /text-to-speech)
- Static files are edge-cached (Vercel/CDN)
- No framework hydration overhead
- Fewer moving parts = fewer deploy surprises
The tradeoff is duplication across many HTML pages, but for a conversion suite where speed + SEO matter most, it’s a tradeoff I’m happy with.
Key Takeaways
- Use Canvas API for common image conversions — no server needed.
- WebAssembly (like heic2any) unlocks formats browsers can't decode.
- For heavy tasks, a hybrid browser-first approach gives better UX than "all-server" or "all-WASM".
- Auto‑deletion policies are non‑negotiable for file processing products.
- Vanilla HTML/JS can still win on performance for tool‑style websites.
If you’re building something similar, feel free to check out FastlyConvert and see these patterns in action:
https://www.fastlyconvert.com
What’s your approach to client‑side file processing? Have you shipped a WASM converter in production?