DEV Community

zhongqiyue

404 bio not found

Location: Beijing, China
When Regex Fails: LLMs for Messy HTML Data

Comments
4 min read

My Web Scraper Was Too Fragile — Here's How AI Fixed It

1
Comments 2
4 min read
Regex broke my scraper: Using LLMs for robust data extraction

2
Comments
5 min read
Why I Gave Up on Perfect Selectors and Asked GPT to Extract My Data

1
Comments
4 min read
I Built a Web Page Summarizer with AI (and Why You Might Not Want To)

1
Comments
4 min read
When Your AI Provider Fails: Building a Resilient Fallback System

1
Comments
5 min read
Why my first RAG system hallucinated (and how I fixed it)

1
Comments
4 min read
I Spent a Weekend Fighting HTML Parsing. Here's What Finally Worked

1
Comments
4 min read
Why regex wasn't enough for data extraction (and what I used instead)

Comments
4 min read
Why My AI Feature Kept Failing (And How I Fixed It)

1
Comments 1
4 min read
Streaming AI Responses in a Serverless World: What I Learned the Hard Way

Comments
5 min read
When Your AI Service Goes Down: Building a Multi-Model Fallback System

Comments
4 min read
How I Messed Up AI Streaming (And How You Can Avoid It)

Comments
4 min read
How I Cut My AI API Costs by 70% Without Sacrificing Quality

1
Comments 1
5 min read
I Gave Up on CSS Selectors: Using LLMs for Web Scraping

Comments
5 min read
When Your AI API Keeps Timing Out: A Lesson in Async Chunking

1
Comments
4 min read
Why My Regex-Based Parser Failed and How LLM Function Calling Saved Me

2
Comments
4 min read
I Exposed My API Key Twice Before Building a Proxy — Here's What I Learned

4
Comments
4 min read
Taming AI API Rate Limits with Asyncio Queues

1
Comments
5 min read
Why CSS selectors failed me: Using LLMs to scrape inconsistent web pages

1
Comments
5 min read
My Support Bot Kept Making Stuff Up — Here's How I Fixed It

1
Comments 1
5 min read
Fixing JSON Output from GPT: A Pattern That Actually Works

Comments
5 min read
Why My CSS Selectors Kept Breaking (and How LLMs Fixed It)

Comments
4 min read
When Regex Fails: Using LLMs to Extract Structured Data from Messy Pages

Comments
4 min read
How I stopped fighting with AI APIs and built a clean integration layer

Comments
4 min read
How I Stopped Hitting AI API Rate Limits with a Simple Async Queue

Comments
4 min read
I spent 3 days scraping a site until I tried LLMs for data extraction

2
Comments 2
6 min read
How I stopped writing regex and let AI parse my messy data

Comments
4 min read
When regex fails: extracting structured data from messy text with LLMs

Comments
4 min read
I stopped fighting broken parsers — here's how I use LLMs to extract web data reliably

Comments
4 min read
Why My AI Kept Giving Me Invalid JSON (And How I Fixed It)

Comments
4 min read
I Spent 3 Days Writing a Log Parser, Then an AI Did It in 30 Minutes

Comments
5 min read
When HTML parsing fails: using LLMs to extract messy web data

Comments
4 min read
I spent 3 days writing regexes. Then I asked an AI to do it in 10 minutes.

1
Comments
4 min read
I Spent a Month Fighting LLMs to Extract Structured Data

Comments
4 min read
Why My AI-Powered Feature Almost Got Cancelled (And How I Fixed It)

Comments 1
5 min read
How I stopped worrying about OpenAI rate limits (and costs)

Comments
4 min read
Failing gracefully: building a resilient AI API client with fallback

Comments
4 min read
I Thought I Knew Web Scraping — Until I Hit JavaScript

Comments
4 min read
How I Stopped Fighting Regex and Finally Extracted Data with LLMs

Comments
4 min read
I stopped writing regex for web scraping — here's what I do instead

Comments
4 min read
My AI API Kept Failing — Until I Built This Simple Client

Comments
3 min read
Struggling with Text Extraction? Here’s How I Finally Cleaned Up Messy Data

Comments
4 min read
My web scraping nightmare ended when I let an LLM read the HTML

Comments
5 min read
Why I Gave Up on Regex and Started Using AI for Web Scraping

Comments
5 min read
Regex Hell to LLM Function Calling: My Data Extraction Journey

2
Comments 1
4 min read
I Spent a Weekend Fighting Flaky Scrapers — Here’s What Finally Worked

Comments
5 min read
I Tried AI-Powered Web Scraping So My Selectors Could Finally Rest

Comments
4 min read
I Needed a Smart Search — So I Called an AI API (No Model Training)

Comments
4 min read
PR descriptions from hell: why I stopped chasing perfect AI automation

3
Comments 1
4 min read
I spent hours writing unit tests – so I made an LLM do it (and learned what not to do)

Comments
4 min read
How I Built a Personal AI Gateway to Stop Hitting Rate Limits

Comments
5 min read
Debugging AI Streaming: A Tale of Chunks and Timeouts

Comments
4 min read
I built a simple AI proxy to cut API costs — here's what I learned

Comments
4 min read
Why I stopped hardcoding AI API calls and built a simple abstraction layer

1
Comments
4 min read
How I Finally Tamed Long Document Analysis with LLMs (It Wasn't Simple Chunking)

1
Comments
5 min read
How I Stopped Losing API Calls to Rate Limits (And You Can Too)

Comments
4 min read
I Tried to Build an AI Code Reviewer Without Sharing My Code — Here's What Worked

Comments
5 min read
How I Stopped Worrying and Learned to Love AI API Retries

Comments
4 min read
Struggling to Extract Structured Data from Messy Text? Here's What Finally Worked

Comments
3 min read