If you're sourcing products from China — whether for dropshipping, wholesale import, or market research — you've probably spent hours manually browsing supplier platforms, copying prices into spreadsheets, and comparing MOQs across tabs.
There's a better way. In this guide, I'll walk you through everything I've learned building scrapers for China's three biggest wholesale platforms: Yiwugo.com, DHgate.com, and Made-in-China.com. We'll cover platform differences, technical challenges, working code examples, and the tools I've published on Apify Store that you can use right now.
Why Scrape China Wholesale Platforms?
Manual sourcing doesn't scale. Here's what automated data extraction unlocks:
- Price monitoring — Track price changes across thousands of products daily
- Supplier discovery — Find new suppliers matching your criteria automatically
- Competitive analysis — Compare pricing, MOQs, and product ranges across platforms
- Trend detection — Spot rising product categories before they go mainstream
- Due diligence — Screen supplier ratings, transaction history, and verification status at scale
A single scraping run can collect data that would take a human researcher weeks to compile manually.
The Three Platforms: A Quick Comparison
Before diving into the technical details, let's understand what makes each platform unique:
| Feature | Yiwugo.com | DHgate.com | Made-in-China.com |
|---|---|---|---|
| Focus | Yiwu small commodities | Global wholesale/dropship | B2B manufacturing |
| Typical MOQ | 1-100 units | 1-50 units | 100-10,000 units |
| Buyer type | Small retailers | Dropshippers, small buyers | Import companies, OEMs |
| Product range | 2M+ SKUs | 30M+ SKUs | 15M+ products |
| Language | Chinese (some English) | English | English |
| Anti-scraping | Moderate | Moderate | Low (search pages) |
| Best for | Commodity pricing data | Dropshipping research | Manufacturer sourcing |
Each platform serves a different segment of the supply chain. Scraping all three gives you the most complete picture of China's wholesale market.
Platform Deep Dive: Yiwugo.com
What is Yiwugo?
Yiwugo is the official online platform of the Yiwu International Trade Market — the world's largest small commodities wholesale market. It's where factory owners and distributors in Yiwu list their products, primarily targeting domestic and international small-to-medium buyers.
Data You Can Extract
{
"title": "Stainless Steel Water Bottle 500ml",
"price": "¥8.50 - ¥12.00",
"minOrder": "100 pieces",
"shopName": "Yiwu Hengda Cup Factory",
"shopUrl": "https://www.yiwugo.com/shop/...",
"location": "Yiwu, Zhejiang",
"productUrl": "https://www.yiwugo.com/product/...",
"imageUrl": "https://img.yiwugo.com/...",
"category": "Cups & Bottles"
}
Technical Approach
Yiwugo uses server-side rendering for search results, which makes it relatively straightforward to scrape:
import { CheerioCrawler } from 'crawlee';
const crawler = new CheerioCrawler({
async requestHandler({ request, $, enqueueLinks }) {
// Extract product cards from search results
$('.pro_list_product_img2').each((i, el) => {
const title = $(el).find('.productloc a').text().trim();
const price = $(el).find('.product_price span').text().trim();
const shop = $(el).find('.shop_name a').text().trim();
console.log({ title, price, shop });
});
// Follow pagination
await enqueueLinks({
selector: '.page_next a',
label: 'LIST',
});
},
});
await crawler.run(['https://www.yiwugo.com/product/search?keyword=water+bottle']);
Key Challenges
- Mixed language content — Product titles and descriptions are primarily in Chinese
- Price ranges — Many products show price tiers based on quantity
- Rate limiting — Aggressive scraping triggers IP blocks
- Session management — Some pages require valid session cookies
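The price-range challenge above usually means turning a display string into numbers before any comparison. A minimal sketch, assuming the string holds one or two decimal amounts in the style of the sample record (`parsePriceRange` is a hypothetical helper, not part of the published scraper):

```javascript
// Parse a listing price such as "¥8.50 - ¥12.00" into numeric bounds.
// Assumptions: amounts are plain decimals (commas are treated as thousands
// separators) and the currency is the first of ¥ $ € £ found in the string.
function parsePriceRange(text) {
  const numbers = (text.replace(/,/g, '').match(/\d+(?:\.\d+)?/g) || []).map(Number);
  if (numbers.length === 0) return null; // no price on the card
  const currencyMatch = text.match(/[¥$€£]/);
  return {
    currency: currencyMatch ? currencyMatch[0] : null,
    min: Math.min(...numbers),
    max: Math.max(...numbers),
  };
}
```

A single price such as "¥9.00" yields equal `min` and `max`, so downstream code can treat both cases the same way.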
Ready-to-Use Tool
I've published a production-ready Yiwugo scraper on Apify Store that handles all of these challenges:
👉 Yiwugo Scraper on Apify Store
It supports keyword search, category browsing, pagination, and proxy rotation out of the box.
Platform Deep Dive: DHgate.com
What is DHgate?
DHgate is one of China's largest cross-border e-commerce platforms, connecting Chinese manufacturers directly with international buyers. It's particularly popular with dropshippers because of its low MOQs (often just 1 piece) and built-in buyer protection.
Data You Can Extract
{
"title": "Wireless Bluetooth Earbuds TWS 5.3",
"price": "$3.82 - $5.47",
"originalPrice": "$7.64",
"discount": "50% OFF",
"minOrder": "1 piece",
"sold": "2,847 sold",
"sellerName": "Shenzhen Digital Store",
"sellerRating": "97.8%",
"freeShipping": true,
"productUrl": "https://www.dhgate.com/product/...",
"imageUrl": "https://image.dhgate.com/..."
}
Technical Approach
DHgate renders product listings with a mix of SSR and client-side hydration. For search results, a Cheerio-based approach works well:
import { CheerioCrawler } from 'crawlee';
const crawler = new CheerioCrawler({
async requestHandler({ request, $, log }) {
const products = [];
$('.gallery-item').each((i, el) => {
products.push({
title: $(el).find('.product-title').text().trim(),
price: $(el).find('.price-current').text().trim(),
sold: $(el).find('.sold-count').text().trim(),
seller: $(el).find('.seller-name').text().trim(),
url: $(el).find('a.product-link').attr('href'),
});
});
log.info(`Found ${products.length} products on ${request.url}`);
},
});
await crawler.run(['https://www.dhgate.com/wholesale/search.do?searchkey=bluetooth+earbuds']);
Key Challenges
- Dynamic pricing — Prices change based on quantity tiers and promotions
- Anti-bot measures — Cloudflare protection on some pages
- Pagination limits — Search results cap at ~40 pages
- Image CDN — Product images use a separate CDN with transformation parameters
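Because promotions change the displayed price, it can be useful to recompute the discount from the raw strings rather than rely on the label text. A rough sketch, assuming the lowest tier price is compared against the original price as in the sample record (`lowestPrice` and `discountPercent` are hypothetical helper names):

```javascript
// Return the smallest amount found in a price string, or null if none.
function lowestPrice(text) {
  const numbers = (text.replace(/,/g, '').match(/\d+(?:\.\d+)?/g) || []).map(Number);
  return numbers.length ? Math.min(...numbers) : null;
}

// Discount of the lowest current price relative to the original price,
// as a whole-number percentage. Returns null when either value is missing.
function discountPercent(priceText, originalPriceText) {
  const current = lowestPrice(priceText);
  const original = lowestPrice(originalPriceText);
  if (current === null || original === null || original <= 0) return null;
  return Math.round((1 - current / original) * 100);
}
```

For the sample record, `discountPercent('$3.82 - $5.47', '$7.64')` gives 50, which matches the "50% OFF" label.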
Ready-to-Use Tool
👉 DHgate Scraper on Apify Store
Handles search, category browsing, seller filtering, and automatic proxy rotation.
Platform Deep Dive: Made-in-China.com
What is Made-in-China.com?
Made-in-China.com (MIC) is a B2B platform focused on connecting international buyers with Chinese manufacturers. Unlike DHgate (which targets individual buyers), MIC is designed for bulk purchasing and OEM/ODM sourcing. It's where you find factories, not resellers.
Data You Can Extract
{
"title": "CNC Machining Aluminum Parts Custom Manufacturing",
"price": "US $0.5-50 / Piece",
"minOrder": "100 Pieces",
"supplier": "Shenzhen Precision Machinery Co., Ltd.",
"supplierType": "Gold Member",
"verified": true,
"yearsOnPlatform": 8,
"location": "Guangdong, China",
"productUrl": "https://www.made-in-china.com/...",
"imageUrl": "https://image.made-in-china.com/..."
}
Technical Approach
MIC's search results are server-side rendered, making them the easiest of the three platforms to scrape:
```javascript
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
  async requestHandler({ request, $, log }) {
    const products = [];
    // Each search result card holds one product plus its supplier
    $('.prod-list .prod-item').each((i, el) => {
      products.push({
        title: $(el).find('.prod-name a').text().trim(),
        price: $(el).find('.prod-price').text().trim(),
        moq: $(el).find('.prod-moq').text().trim(),
        supplier: $(el).find('.company-name a').text().trim(),
        location: $(el).find('.company-location').text().trim(),
      });
    });
    log.info(`Extracted ${products.length} products`);
  },
});

await crawler.run(['https://www.made-in-china.com/products-search/hot-china-products/CNC_Parts.html']);
```
### Key Challenges
1. **Detail page protection** — Product detail pages are behind FCaptcha verification
2. **Supplier verification data** — Some verification badges require authenticated access
3. **Contact information** — Supplier contact details are partially hidden
4. **Large result sets** — Popular categories can have 100,000+ listings
### Ready-to-Use Tool
👉 [Made-in-China Scraper on Apify Store](https://apify.com/jungle_intertwining/made-in-china-scraper)
Extracts search results with full product and supplier metadata, supports keyword search and pagination.
## Anti-Detection Best Practices
China wholesale platforms have varying levels of anti-scraping protection. Here's what works across all three:
### 1. Rotate Proxies
Never scrape from a single IP. Use residential proxies for best results:
```javascript
import { ProxyConfiguration } from 'crawlee';

const proxyConfiguration = new ProxyConfiguration({
  proxyUrls: [
    'http://user:pass@proxy1.example.com:8080',
    'http://user:pass@proxy2.example.com:8080',
  ],
});
```
Apify's built-in proxy pool handles this automatically when you run actors on the platform.
2. Respect Rate Limits
Cap concurrency and request throughput so you stay under each platform's rate limiters:
const crawler = new CheerioCrawler({
minConcurrency: 1,
maxConcurrency: 3,
maxRequestsPerMinute: 30,
// ...
});
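A throughput cap also sets a floor on how long a run takes, which is worth checking before scheduling large jobs. A back-of-the-envelope sketch (`estimateRunMinutes` is a hypothetical helper; it ignores retries and response times, so real runs take longer):

```javascript
// Lower bound on run time, in minutes, for a given request budget.
function estimateRunMinutes(totalRequests, maxRequestsPerMinute) {
  if (maxRequestsPerMinute <= 0) {
    throw new RangeError('maxRequestsPerMinute must be positive');
  }
  return totalRequests / maxRequestsPerMinute;
}
```

At 30 requests per minute, 1,200 pages need at least 40 minutes.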
3. Rotate User Agents
Vary your User-Agent header to look like different browsers:
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36...',
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36...',
];
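One simple way to vary the header is to cycle through the list in order. The sketch below is only an illustration: `createRotator` is a hypothetical helper, the agent strings are placeholders rather than real User-Agent values, and wiring the result into the crawler's request headers is left out.

```javascript
// Build a function that returns the next value from a list, wrapping around.
function createRotator(values) {
  if (values.length === 0) throw new Error('need at least one value');
  let index = 0;
  return () => {
    const value = values[index];
    index = (index + 1) % values.length;
    return value;
  };
}

// Placeholder strings; substitute full User-Agent values in practice.
const nextUserAgent = createRotator(['agent-a', 'agent-b', 'agent-c']);

// Each outgoing request takes the next header value in turn.
const headers = { 'User-Agent': nextUserAgent() };
```

Round-robin keeps the distribution even; picking at random works too, but makes runs harder to reproduce when debugging blocks.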
4. Handle Cloudflare
When you hit Cloudflare protection (common on DHgate), switch to a browser-based approach:
import { PlaywrightCrawler } from 'crawlee';
const crawler = new PlaywrightCrawler({
headless: false, // Headed mode passes more challenges
browserPoolOptions: {
useFingerprints: true,
},
// ...
});
5. Cache and Deduplicate
Don't re-scrape data you already have. Use request queues with deduplication:
import { RequestQueue } from 'crawlee';
const requestQueue = await RequestQueue.open();
await requestQueue.addRequest({
url: productUrl,
uniqueKey: productId, // Prevents duplicate scraping
});
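Deduplication only works if the same product always maps to the same key. A minimal sketch, assuming a product is identified by host and path and that query strings only carry tracking or session data (both assumptions to verify per platform; `productKey` and `dedupeByKey` are hypothetical helpers):

```javascript
// Derive a stable key from a product URL by dropping query string and fragment.
function productKey(productUrl) {
  const url = new URL(productUrl);
  const path = url.pathname.replace(/\/+$/, ''); // ignore trailing slashes
  return `${url.hostname}${path}`;
}

// Keep the first item seen for each key, preserving input order.
function dedupeByKey(items, keyFn) {
  const seen = new Set();
  return items.filter((item) => {
    const key = keyFn(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

The same key function can feed `uniqueKey` when enqueueing and can clean up a dataset after the fact.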
Cross-Platform Price Comparison: A Real Example
Here's a practical workflow that combines data from all three platforms:
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });
async function comparePrices(keyword) {
// Run all three scrapers in parallel
const [yiwugo, dhgate, mic] = await Promise.all([
client.actor('jungle_intertwining/yiwugo-scraper').call({
keyword,
maxProducts: 20,
}),
client.actor('jungle_intertwining/dhgate-scraper').call({
keyword,
maxProducts: 20,
}),
client.actor('jungle_intertwining/made-in-china-scraper').call({
keyword,
maxProducts: 20,
}),
]);
// Collect results
const results = {
yiwugo: await client.dataset(yiwugo.defaultDatasetId)
.listItems().then(r => r.items),
dhgate: await client.dataset(dhgate.defaultDatasetId)
.listItems().then(r => r.items),
mic: await client.dataset(mic.defaultDatasetId)
.listItems().then(r => r.items),
};
// Compare prices, using the first number in each price string (the low end of a range)
for (const [platform, items] of Object.entries(results)) {
const prices = items
.map((item) => parseFloat(String(item.price ?? '').match(/\d+(?:\.\d+)?/)?.[0]))
.filter((price) => !Number.isNaN(price));
const avgPrice = prices.length ? prices.reduce((sum, p) => sum + p, 0) / prices.length : 0;
console.log(`${platform}: ${items.length} products, avg price: $${avgPrice.toFixed(2)}`);
}
}
comparePrices('bluetooth earbuds');
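One caveat when comparing across platforms: the sample Yiwugo records are priced in ¥ while the DHgate and Made-in-China.com samples are in US dollars, so raw numbers are not directly comparable. A hedged sketch of normalizing before aggregating, where `cnyPerUsd` is a caller-supplied exchange rate (a hypothetical parameter; no real rate is assumed here) and the median is used because it is less sensitive to outlier listings than the mean:

```javascript
// Convert the low end of a price string to USD.
// Assumption: a "¥" anywhere in the string means the amount is in CNY;
// everything else is treated as already being in USD.
function toUsd(priceText, cnyPerUsd) {
  const match = priceText.replace(/,/g, '').match(/\d+(?:\.\d+)?/);
  if (!match) return null;
  const amount = Number(match[0]);
  return priceText.includes('¥') ? amount / cnyPerUsd : amount;
}

// Median of a list of numbers, or null for an empty list.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Mapping each platform's items through `toUsd` and dropping `null` results gives a list that `median` can summarize per platform.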
For a complete cross-platform toolkit with supplier ranking and price comparison scripts, check out the GitHub repository:
👉 China Wholesale Scraper Toolkit
Common Use Cases
1. Dropshipping Product Research
Use DHgate data to find products with high sales volume and good margins. Filter by:
- Price under $10 (for 3-5x markup potential)
- Seller rating above 95%
- Free shipping available
- 100+ units sold
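These filters map directly onto the fields in the DHgate sample record shown earlier. A minimal sketch, assuming the scraped values are still display strings such as "2,847 sold" and "97.8%", and that "price" means the lowest tier price (`firstNumber` and `isDropshipCandidate` are hypothetical helpers):

```javascript
// Pull the first number out of a display string; NaN when there is none.
function firstNumber(text) {
  const match = String(text ?? '').replace(/,/g, '').match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : NaN;
}

// Apply the four criteria from the list above.
// Comparisons against NaN are false, so records with missing fields are dropped.
function isDropshipCandidate(item) {
  return (
    firstNumber(item.price) < 10 &&
    firstNumber(item.sellerRating) > 95 &&
    item.freeShipping === true &&
    firstNumber(item.sold) >= 100
  );
}
```

Usage is a one-liner on the dataset items: `items.filter(isDropshipCandidate)`.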
2. Wholesale Price Benchmarking
Compare the same product category across all three platforms to find the best wholesale price. Yiwugo typically has the lowest prices for small commodities, while Made-in-China.com offers better rates for bulk manufacturing orders.
3. Supplier Verification Pipeline
Build an automated pipeline that:
- Scrapes supplier lists from Made-in-China.com
- Filters by verification status, years on platform, and location
- Cross-references with Yiwugo for alternative suppliers
- Outputs a ranked shortlist for manual review
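The filter-and-rank part of that pipeline can be sketched with the fields from the Made-in-China.com sample record. This is an illustration under assumptions: ranking by years on platform is a stand-in criterion, the thresholds are arbitrary defaults, and the Yiwugo cross-reference step is omitted (`shortlistSuppliers` is a hypothetical helper).

```javascript
// Filter suppliers by verification, tenure, and location, then rank by tenure.
function shortlistSuppliers(items, { minYears = 3, provinces = [] } = {}) {
  return items
    .filter((s) => s.verified === true)
    .filter((s) => (s.yearsOnPlatform ?? 0) >= minYears)
    .filter(
      (s) =>
        provinces.length === 0 ||
        provinces.some((p) => (s.location ?? '').includes(p)),
    )
    // filter() returned a new array, so sorting here leaves the input untouched
    .sort((a, b) => b.yearsOnPlatform - a.yearsOnPlatform)
    .map((s) => ({
      supplier: s.supplier,
      yearsOnPlatform: s.yearsOnPlatform,
      location: s.location,
    }));
}
```

Calling `shortlistSuppliers(items, { minYears: 5, provinces: ['Guangdong'] })` would return only verified Guangdong suppliers with at least five years on the platform, longest-tenured first.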
4. Market Trend Analysis
Run weekly scrapes across all platforms for your product categories. Track:
- New product listings (emerging trends)
- Price movements (supply/demand shifts)
- New supplier entries (market competition)
- Category growth rates
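Tracking new listings and price movements comes down to diffing two weekly snapshots. A minimal sketch, assuming each item carries the `productUrl` and `price` fields from the sample records and that `productUrl` is stable between runs (`diffSnapshots` is a hypothetical helper):

```javascript
// Compare last week's items with this week's.
// Returns URLs that are new, plus products whose price string changed.
function diffSnapshots(previous, current) {
  const before = new Map(previous.map((item) => [item.productUrl, item]));
  const newListings = [];
  const priceChanges = [];
  for (const item of current) {
    const old = before.get(item.productUrl);
    if (!old) {
      newListings.push(item.productUrl);
    } else if (old.price !== item.price) {
      priceChanges.push({ productUrl: item.productUrl, from: old.price, to: item.price });
    }
  }
  return { newListings, priceChanges };
}
```

Comparing price strings directly flags any change in the displayed range; parsing them to numbers first would allow thresholds such as "moved more than 5%".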
5. Competitive Intelligence
Monitor your competitors' supplier platforms. If they're sourcing from Yiwugo, you can find the same suppliers and negotiate directly — or find better alternatives on Made-in-China.com.
Legal and Ethical Considerations
Web scraping operates in a legal gray area. Here are guidelines to stay on the right side:
- Respect robots.txt — Check each platform's robots.txt before scraping
- Don't overload servers — Use reasonable rate limits and concurrency
- Public data only — Only scrape publicly accessible information
- No login circumvention — Don't bypass authentication walls
- Commercial use — Review each platform's Terms of Service regarding data usage
- Data storage — Handle collected data responsibly, especially supplier contact information
Getting Started
The fastest way to start collecting China wholesale data:
- Create a free Apify account at apify.com
- Pick a scraper based on your use case:
  - Yiwugo Scraper — Best for commodity pricing
  - DHgate Scraper — Best for dropshipping research
  - Made-in-China Scraper — Best for B2B sourcing
- Enter your search keywords and run
- Export results as JSON, CSV, or Excel
- Schedule recurring runs for ongoing monitoring
Each tool comes with detailed documentation, example outputs, and FAQ sections on their Apify Store pages.
What's Next
I'm actively developing more tools for the China wholesale ecosystem:
- Pinduoduo (拼多多) Scraper — China's largest group-buying platform (coming soon)
- Cross-platform analytics dashboard — Visual comparison of pricing trends
- Supplier scoring algorithm — Automated supplier reliability ratings
For updates, example code, and tutorials:
- 📦 Apify Store — Yiwugo Scraper
- 📦 Apify Store — DHgate Scraper
- 📦 Apify Store — Made-in-China Scraper
- 🛠️ GitHub — China Wholesale Scraper Toolkit
- 🛠️ GitHub — Yiwugo Scraper Example
- 🛠️ GitHub — DHgate Scraper Example
- 🛠️ GitHub — Made-in-China Scraper Example
Building tools to make China wholesale data accessible to everyone. Questions or feature requests? Open an issue on GitHub or leave a comment below.