SHOTA

Building a Japanese-First Read-Later PWA: From Pocket Shutdown to Launch

When Mozilla shut down Pocket in July 2025, I lost my favorite tool. Worse, none of the English alternatives (Instapaper, Readwise, Matter, Raindrop) had a Japanese UI, and their article extraction was mediocre on Japanese pages.

So I built one. It's called Readbox: a Japanese-first, English-too read-later app, shipped as a PWA. Here's what I learned shipping it.

The stack

  • Next.js 15 App Router + TypeScript strict (no any)
  • Supabase (Postgres + Auth + RLS)
  • Stripe (JPY + USD prices, locale-routed)
  • Tailwind CSS
  • Service Worker for PWA install + offline read

Three things that bit me

1. Article extraction on Vercel serverless

First attempt: Mozilla Readability + jsdom. It doesn't bundle on Vercel because of ESM compatibility issues and the 50MB serverless function size limit. I tried six approaches, including Webpack externals, dynamic imports, and the edge runtime; none worked cleanly.

Ended up using Jina Reader, which returns clean Markdown/HTML from any URL. Trade-off: third-party dependency, rate limits at scale. But it works today, and it's free.
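A minimal sketch of what that call can look like, assuming Jina Reader's public URL-prefix endpoint (https://r.jina.ai/ followed by the target URL). The helper names are hypothetical and the error handling is deliberately thin.

```typescript
// Assumption: Jina Reader is called by prefixing the target URL with this base.
const JINA_READER_BASE = "https://r.jina.ai/";

// Hypothetical helper: validate the target and build the Reader URL.
function buildReaderUrl(target: string): string {
  const parsed = new URL(target); // throws on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return JINA_READER_BASE + parsed.toString();
}

// Hypothetical helper: fetch the extracted article text for a URL.
async function extractArticle(target: string): Promise<string> {
  const res = await fetch(buildReaderUrl(target));
  if (!res.ok) {
    // A 429 here would be the rate limit mentioned above.
    throw new Error(`Reader request failed with status ${res.status}`);
  }
  return res.text();
}
```

Keeping the URL building separate from the network call makes the validation easy to unit test without hitting the third-party service.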

2. Storing article body on-device

I didn't want to host millions of articles' worth of HTML on Supabase (cost + privacy). Solution: extracted HTML lives in the browser's IndexedDB only (via Dexie); only metadata (URL, title, tags, read status) syncs to the server.
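The split can be sketched as a pure function. The field names and types below are assumptions for illustration, not Readbox's actual schema: the local half is what would be written to IndexedDB (through Dexie), and the remote half is what would sync to the server.

```typescript
// Hypothetical shape of an article right after extraction.
interface SavedArticle {
  id: string;
  url: string;
  title: string;
  tags: string[];
  read: boolean;
  html: string; // extracted body
}

// Stays in the browser's IndexedDB only.
interface LocalBody {
  id: string;
  html: string;
}

// Syncs to the server; deliberately has no body field.
type ArticleMetadata = Omit<SavedArticle, "html">;

// Separate the on-device body from the metadata that is allowed to leave the device.
function splitForStorage(article: SavedArticle): {
  local: LocalBody;
  remote: ArticleMetadata;
} {
  const { html, ...remote } = article;
  return { local: { id: article.id, html }, remote };
}
```

Typing the server payload as `Omit<SavedArticle, "html">` means a later change that tries to send the body upstream fails at compile time rather than in production.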

Trade-off: body content doesn't sync across devices; only the metadata does. Acceptable for a "read it later" workflow where you usually read on the device you saved on.

3. i18n routing — the silent sitemap killer

For Japanese + English from one codebase, I use an app/[locale]/ segment with an /en prefix for English. Japanese, the default, has no prefix, which preserves old URLs.

Middleware detects cookie / Accept-Language and redirects accordingly.
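The detection step might look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the cookie is assumed to win over the header, and unsupported languages fall back to Japanese.

```typescript
type Locale = "ja" | "en";
const DEFAULT_LOCALE: Locale = "ja";

function isLocale(value: string): value is Locale {
  return value === "ja" || value === "en";
}

// Hypothetical helper: an explicit cookie choice wins, then Accept-Language.
function detectLocale(
  cookieLocale: string | undefined,
  acceptLanguage: string | null,
): Locale {
  if (cookieLocale !== undefined && isLocale(cookieLocale)) return cookieLocale;
  if (acceptLanguage === null) return DEFAULT_LOCALE;

  // Parse entries like "en-US,en;q=0.9,ja;q=0.8" into language + quality pairs.
  const ranked = acceptLanguage
    .split(",")
    .map((entry) => {
      const [tag = "", ...params] = entry.trim().split(";");
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      const q = qParam === undefined ? 1 : Number.parseFloat(qParam.slice(2));
      const lang = (tag.trim().split("-")[0] ?? "").toLowerCase();
      return { lang, q: Number.isNaN(q) ? 0 : q };
    })
    .sort((a, b) => b.q - a.q); // highest preference first

  for (const { lang, q } of ranked) {
    if (q > 0 && isLocale(lang)) return lang;
  }
  return DEFAULT_LOCALE;
}
```

In Next.js middleware the two inputs would come from `request.cookies.get(name)?.value` and `request.headers.get("accept-language")`; the cookie name is whatever the app sets when the user switches language.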

The gotcha (cost me a launch-day hour): my middleware matcher excluded _next, api, and image extensions. But if you forget .xml, .txt, and .webmanifest, requests for sitemap.xml and robots.txt get rewritten under /ja (for example /ja/sitemap.xml), which doesn't exist as a route, so they 404.

Fix:

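export const config = {
  matcher: [
    // Skip API routes, Next.js internals, and any path ending in a static or
    // metadata file extension (sitemap.xml, robots.txt, manifest.webmanifest).
    '/((?!api|_next|.*\\.(?:xml|txt|webmanifest|svg|png|jpg|jpeg|gif|webp|ico)$).*)',
  ],
};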

If you do i18n routing with metadata routes (sitemap/robots/manifest), put this in your test plan.
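One rough way to cover it: mirror the matcher as a plain RegExp and assert on a handful of paths. This is only an approximation, since Next.js compiles matcher strings itself; the `^`/`$` anchors, the helper name, and the sample paths are assumptions.

```typescript
// Same pattern as the matcher, anchored for a standalone RegExp check.
const MATCHER = new RegExp(
  "^/((?!api|_next|.*\\.(?:xml|txt|webmanifest|svg|png|jpg|jpeg|gif|webp|ico)$).*)$",
);

// True when the i18n middleware would run for this path.
function middlewareRuns(pathname: string): boolean {
  return MATCHER.test(pathname);
}

// Paths the locale middleware must leave alone.
const mustSkip = [
  "/sitemap.xml",
  "/robots.txt",
  "/manifest.webmanifest",
  "/api/health",
  "/_next/static/app.js",
];

// Paths that should still go through locale detection.
const mustRun = ["/", "/en", "/en/articles/123"];

const failures = [
  ...mustSkip.filter((p) => middlewareRuns(p)),
  ...mustRun.filter((p) => !middlewareRuns(p)),
];
if (failures.length > 0) {
  throw new Error(`Matcher regression on: ${failures.join(", ")}`);
}
```

One side effect worth knowing: because `api` is matched as a prefix, a page path such as /apis-guide would also bypass the middleware with this pattern.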

Where Readbox is now

Readbox is live with paid plans (¥450/mo or $3/mo; free for up to 100 articles). Pocket CSV import handles both the 5-column legacy and the 6-column new formats, up to 5,000 articles per batch.

ReadBox — Read-it-later PWA for Pocket refugees

Article text is stored on your device only. 1-click Pocket CSV import. No ads, no tracking — just reading. From $3/month.

readbox.dev-tools-hub.xyz

If you migrated from Pocket, I'd love to hear what your read-later workflow looks like and what made you pick whatever you picked.

Top comments (1)

Harjot Singh

Pocket shutting down is the textbook wedge: a proven habit suddenly homeless, and the English alternatives whiffing on Japanese extraction is the specific gap only you were positioned to see. Niche-language-first is a real moat because the incumbents will never prioritize your locale. The stack is almost exactly what I'd reach for: Next 15 + Supabase RLS + Stripe with locale-routed JPY/USD is the boring-but-correct spine. And the Readability-plus-jsdom-doesn't-bundle-on-Vercel-serverless pain is so universal it's almost a rite of passage; serverless and heavy DOM-parsing deps fight constantly. That extraction layer is the actual hard part of a read-later app and where the quality lives, especially on Japanese pages where naive extraction mangles the content. This is exactly the kind of build where the interesting work (Japanese-first extraction) gets buried under the plumbing (auth, billing, PWA, deploy), which is the whole problem I'm attacking with Moonshift. How'd you end up solving extraction: a hosted parser API, or a different lib that bundles cleanly?