Tyson Cung

Posted on Feb 19

How I Built a Type-Safe Excel Library with Zod

#zod #opensource #typescript

Every project I've worked on eventually needs to import or export Excel files. Customer data, reports, inventory lists, you name it. And every time, it's the same story: read the file with ExcelJS, manually map columns, write a bunch of type coercion logic, and hope the data is clean.

Spoiler: the data is never clean.

After writing this code for the fifth time, I decided to build something better.

The Problem

Here's what typical Excel parsing looks like in TypeScript:

const workbook = new ExcelJS.Workbook();
await workbook.xlsx.readFile('users.xlsx');
const sheet = workbook.getWorksheet(1);

const users = [];
sheet.eachRow((row, rowNumber) => {
  if (rowNumber === 1) return; // skip header
  users.push({
    name: String(row.getCell(1).value),
    email: String(row.getCell(2).value),
    age: Number(row.getCell(3).value),
    // is this a date or a serial number? who knows
    joinDate: row.getCell(4).value as Date,
    // "Yes", "yes", "TRUE", "1", true... all valid
    active: row.getCell(5).value === true,
  });
});

This has a few problems:

No validation. If someone puts "abc" in the age column, you get NaN and find out later.
No types. The users array is any[] unless you manually define an interface.
Brittle column mapping. If someone reorders the columns, everything breaks.
Repeated everywhere. Every Excel import needs the same boilerplate.

The Idea

I was already using Zod for API validation. Zod schemas give you both runtime validation AND TypeScript types from a single definition. What if I could use the same approach for Excel files?

Define the shape once. Get types, validation, and column mapping for free.

What excel-zod Does

npm install excel-zod zod

import { z } from 'zod';
import { createExcelSchema } from 'excel-zod';

const UserSchema = createExcelSchema(z.object({
  name: z.string(),
  email: z.string().email(),
  age: z.number().int().positive(),
  joinDate: z.date(),
  active: z.boolean(),
}));

That's your schema. Now reading an Excel file is one line:

const { data, errors } = await UserSchema.parse('users.xlsx');

data is fully typed as { name: string, email: string, age: number, joinDate: Date, active: boolean }[]. Not any[]. Real types, inferred from the Zod schema.

And errors tells you exactly what went wrong:

Row 3, Column "email": Invalid email (got: "not-an-email")
Row 5, Column "age": Number must be greater than 0 (got: -1)

Row number, column name, what the validator expected vs what it got. No more guessing.

Writing Works Too

await UserSchema.write(data, 'output.xlsx');

This creates an Excel file with:

Bold headers with background fill
Auto-sized columns
Date formatting for date columns
Boolean values formatted as "Yes" / "No"

No manual ExcelJS setup required.

Template Generation

Need to send a blank template for someone to fill in?

await UserSchema.generateTemplate('template.xlsx');

You get an Excel file with headers, a frozen header row, and data validation hints. Hand it to your ops team and they know exactly what format to use.

How It Works Under the Hood

The core is pretty simple. An ExcelSchema wraps a Zod object schema and adds Excel-specific behavior:

Reading: Headers are matched to schema keys (case-insensitive). Each cell value gets coerced to match the expected Zod type before validation. Strings that look like numbers become numbers. "Yes"/"No"/"true"/"false" become booleans. Excel serial dates become JS Date objects.

Validation: Every row goes through schema.safeParse(). Valid rows end up in data, invalid rows produce structured errors with row numbers and column names.

Column mapping: By default it's automatic. But you can override it:

const { data } = await UserSchema.parse('file.xlsx', {
  columns: [
    { key: 'name', header: 'Full Name' },
    { key: 'email', header: 'Email Address' },
  ],
});

Or map by column index if the headers are messy:

const { data } = await UserSchema.parse('file.xlsx', {
  columns: [
    { key: 'name', index: 1 },
    { key: 'email', index: 3 },
  ],
});

The Type Coercion Problem

This was the trickiest part to get right. Excel cells are loosely typed. A "number" column might contain strings, dates, or rich text objects. A "date" might be a JS Date, an ISO string, or an Excel serial number (days since Jan 1, 1900).

The coercion layer sits between ExcelJS and Zod:

Raw cell value -> Coerce to expected type -> Zod validation

For each cell, I look at the Zod schema to determine what type is expected, then try to coerce the raw value. If coercion fails, the raw value passes through to Zod, which produces a clear error message.

The key insight: unwrap Zod wrappers (optional, default, effects) to find the base type, then coerce based on that.

function unwrapZodType(schema: z.ZodTypeAny): z.ZodTypeAny {
  if (schema instanceof z.ZodOptional || schema instanceof z.ZodDefault) {
    return unwrapZodType(schema._def.innerType);
  }
  if (schema instanceof z.ZodEffects) {
    return unwrapZodType(schema._def.schema);
  }
  return schema;
}

Lessons Learned

Start with the API you want, then build backwards. I wrote the usage examples first, then figured out the implementation. If the API feels good to use, the internals can be messy (they aren't, but they could be).

Excel dates are cursed. Excel incorrectly treats 1900 as a leap year (a bug inherited from Lotus 1-2-3 in the 1980s). Every Excel date library has to account for this. Fun times.

Case-insensitive header matching is essential. People write "Name", "name", "NAME", and "nAmE". Your library should handle all of them without configuration.

Zod's safeParse is perfect for this. It gives you structured errors without throwing, which is exactly what you want when processing hundreds of rows. You don't want to stop at the first bad row.

What's Next

I'm working on a Pro version with:

Multi-sheet support (different schema per sheet)
Streaming for large files (row-by-row processing without loading everything into memory)
Column mapping with fuzzy matching and aliases
Excel templates with custom styling

The free version covers most use cases. The Pro version is for teams dealing with complex, large-scale Excel workflows.