I like vendor benchmarks as they are often good illustrations of worst practice. Rather than focusing on real implementation trade-offs, they pick an extreme case that favors their own implementation by chance but is not optimized in the competitor's — because it is simply not how that technology is meant to be used. For example, Oracle published "2x" benchmark results using YCSB, a key-value benchmark (slide 14 here), and a "529x" test misusing the raw BSON purpose (here). Both compare a relational database against a document database by using neither technology as it was designed to be used — no normalized tables, no nested documents, just flat key-value fields. The latter test is even worse: it uses 1,000 top-level fields in a single document. Don't do that!
While having thousands of columns in an SQL table is usually undesirable, it's acceptable to have hundreds or even thousands of fields in a document, as they represent multiple entities and value objects. However, nobody creates a flat structure with 1,000 top-level fields. The advantage of JSON is its ability to organize entities into nested sub-documents. For instance, instead of storing first_name and last_name as separate fields, you can have a name field containing a sub-object with first and last. During queries, using dot notation to reference name.first and name.last makes no difference from first_name or last_name. It simplifies reading and displaying the document, and, as that is how it should be used, the binary JSON formats are optimized for it.
To illustrate this, I created two collections, flat with one thousand fields at the top level, and nest with ten top-level fields, each containing a sub-object with ten sub-objects.
{
echo field{0..9}{0..9}{0..9}
for i in {1..100000} ; do echo {0..9}{0..9}{0..9} ; done
} | sed 's/ /,/g' |
mongoimport --collection "flat" --type=csv --headerline --drop
{
echo field{0..9}.sub{0..9}.sub{0..9}
for i in {1..100000} ; do echo {0..9}{0..9}{0..9} ; done
} | sed 's/ /,/g' |
mongoimport --collection "nest" --type=csv --headerline --drop
The flat documents are like:
{ _id: ObjectId('6a1a1ed41d55219677a79c22'),
field000: 0, field001: 1, field002: 2, field003: 3, field004: 4, field005: 5, field006: 6, field007: 7, field008: 8, field009: 9,
field010: 10, field011: 11, field012: 12, field013: 13, field014: 14, field015: 15, field016: 16, field017: 17, field018: 18, field019: 19,
field020: 20, field021: 21, field022: 22, field023: 23, field024: 24, field025: 25, field026: 26, field027: 27, field028: 28, field029: 29,
field030: 30, field031: 31, field032: 32, field033: 33, field034: 34, field035: 35, field036: 36, field037: 37, field038: 38,
...
The nested documents are like:
{
_id: ObjectId('6a1a160041c5538a7e93a9c5'),
field0: {
sub0: { sub0: 0, sub1: 1, sub2: 2, sub3: 3, sub4: 4, sub5: 5, sub6: 6, sub7: 7, sub8: 8, sub9: 9 },
sub1: { sub0: 10, sub1: 11, sub2: 12, sub3: 13, sub4: 14, sub5: 15, sub6: 16, sub7: 17, sub8: 18, sub9: 19 },
sub2: { sub0: 20, sub1: 21, sub2: 22, sub3: 23, sub4: 24, sub5: 25, sub6: 26, sub7: 27, sub8: 28, sub9: 29 },
sub3: { sub0: 30, sub1: 31, sub2: 32, sub3: 33, sub4: 34, sub5: 35,
...
They hold the same values, with a different field structure.
BSON serialization format optimization
BSON is designed for efficient sequential streaming through network or disk access. Each field is stored one after another, with a small header containing the field type and name, followed by the value. Because every value type has either a fixed size (like a 64-bit integer or a double) or an explicit length prefix (like a string or a binary blob), the parser can skip over any field it doesn't care about without reading its value at all — it simply jumps forward by the declared length.
Nested sub-documents and arrays are stored the same way: the type is object or array, followed by the total byte length of the entire nested structure, followed by the nested fields themselves (for arrays, the field names are simply "0", "1", "2", and so on). That length prefix is the key optimization: if the parser is looking for a top-level field and the current field is a sub-document that doesn't match, it can skip the entire sub-document in one jump — all its nested fields, however deep — without reading a single byte inside it.
When you query a field using dot notation, such as "name.first", the parser works level by level. It scans the top-level fields looking for name. Any top-level field that doesn't match name is skipped, including any sub-documents, in a single jump. Once name is found and confirmed to be of type object, the parser steps into that sub-document and begins scanning its fields for first, again skipping anything that doesn't match. It never needs to read the sibling sub-documents of name at all.
The worst case is a field that doesn't exist: the parser must scan every field at the relevant nesting level before concluding it isn't there. But even then, sub-documents and arrays that are clearly irrelevant — because their parent name doesn't match — are each skipped in a single operation regardless of how large or deeply nested they are.
This is precisely why organizing many fields into a nested hierarchy is not just a cosmetic preference. A flat document with 1,000 top-level fields forces the parser to read 1,000 field names in the worst case. A document with 10 top-level fields, each containing 10 sub-fields, each containing 10 values, holds the same 1,000 values but the parser only ever reads at most 10 + 10 + 10 = 30 field names to locate any one of them. The BSON length prefix turns the nested structure into an implicit index, making the format genuinely faster to navigate at scale.
With the collection flat that has one thousand top-level fields from "field000" to "field999", looking for the last field "field999" takes more than one second:
db.flat.find(
{ "field999" : {$lt:42} }
).explain("executionStats").executionStats
;
{
executionSuccess: true,
nReturned: 0,
executionTimeMillis: 1081,
totalKeysExamined: 0,
totalDocsExamined: 100000,
executionStages: {
isCached: false,
stage: 'COLLSCAN',
filter: { field999: { '$lt': 42 } },
nReturned: 0,
executionTimeMillisEstimate: 1070,
works: 100001,
advanced: 0,
needTime: 100000,
needYield: 0,
saveState: 64,
restoreState: 64,
isEOF: 1,
direction: 'forward',
docsExamined: 100000
}
}
With the collection nest that has ten top-level fields from "field0" to "field9", and two levels of ten sub-objects from "sub0" to "sub9", looking for the last field "field9.sub9.sub9" is twice as fast:
db.nest.find(
{ "field9.sub9.sub9" : {$lt:42} }
).explain("executionStats").executionStats
;
{
executionSuccess: true,
nReturned: 0,
executionTimeMillis: 424,
totalKeysExamined: 0,
totalDocsExamined: 100000,
executionStages: {
isCached: false,
stage: 'COLLSCAN',
filter: { 'field9.sub9.sub9': { '$lt': 42 } },
nReturned: 0,
executionTimeMillisEstimate: 420,
works: 100001,
advanced: 0,
needTime: 100000,
needYield: 0,
saveState: 21,
restoreState: 21,
isEOF: 1,
direction: 'forward',
docsExamined: 100000
}
}
A major part of the time is fetching the document from storage. Navigating into it is a smaller part, but the difference is still clearly visible: 1,081 ms versus 424 ms for 100,000 documents, with the same number of values, and the same worst-case field to find: the last one. The only difference is how those fields are arranged. Nesting them three levels deep, with ten branches at each level, makes the collection scan faster — with no index, no schema change, and no query rewrite beyond the dot notation that you would use anyway.
This is not a micro-optimization or an edge case. Any collection scan, whether triggered by a missing index, a low-selectivity filter, or a background analytics query, pays this cost on every document it reads. The deeper and wider your documents are, the more the BSON length-prefix trick pays off, because the parser can leap over entire branches of the document tree in a single bounds check.
What about Oracle's OSON format?
Oracle Database stores JSON documents in its proprietary binary format called OSON. At first glance, OSON appears to take a fundamentally different approach: rather than storing field names inline with each value as BSON does, OSON builds a field name dictionary at the beginning of the document. Think of it like a mini-datastore with its catalog and indexes, rather than a protocol buffer. Each field in the document body then refers to its name by a short numeric ID rather than repeating the full string. This makes individual field names cheaper to compare and reduces document size when the same field name appears many times.
Given that design, you might expect nesting to make no difference in Oracle: if every field is just a numeric ID anyway, scanning 1,000 flat fields should cost the same as scanning 10 + 10 + 10 fields spread across three levels. The dictionary lookup cost is the same either way.
In practice, however, the same experiment on Oracle Database shows a very similar result to MongoDB.
Querying field999 on the flat table takes about 1.6 seconds:
set autotrace traceonly
select /*+ gather_plan_statistics */ "DATA" from "ORA"."flat"
where JSON_EXISTS("DATA",'$?(@.field999.numberOnly() < $B0)'
passing 42 as "B0" type(strict))
;
PLAN_TABLE_OUTPUT
________________________________________________________________________________________________
SQL_ID 1h98k751w7dws, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ "DATA" from "ORA"."flat" where
JSON_EXISTS("DATA",'$?(@.field999.numberOnly() < $B0)' passing 42 as
"B0" type(strict))
Plan hash value: 4073748891
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:01.60 | 301K| 739 |
|* 1 | TABLE ACCESS FULL| flat | 1 | 12 | 0 |00:00:01.60 | 301K| 739 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_EXISTS2( /*+ QJSNMD_USE_SHR_XCTX */ "DATA" /*+ LOB_BY_VALUE */
FORMAT OSON , '$?(@.field999.numberOnly() < $B0)' /* json_path_str
$?(@.field999.numberOnly() < $B0) */ PASSING 42 AS "B0" FALSE ON ERROR
TYPE(STRICT) )=1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
-----------------------------------------------------------
117 CPU used by this session
117 CPU used when call started
160 DB time
1174164 RM usage by this session
3 Requests to/from client
738 Session total flash IO requests
6053888 cell physical IO interconnect bytes
301074 consistent gets
301074 consistent gets from cache
301074 consistent gets pin
1068 consistent gets pin (fastpath)
739 gcs data block access records
1 messages sent
747 non-idle wait count
48 non-idle wait time
2 opened cursors cumulative
1 opened cursors current
738 physical read total IO requests
6053888 physical read total bytes
2 process last non-idle time
301074 session logical reads
48 user I/O wait time
4 user calls
Querying field9.sub9.sub9 on the nested table takes about 1.0 second:
select /*+ gather_plan_statistics */ "DATA" from "ORA"."nest"
where JSON_EXISTS("DATA",'$?(@.field9.sub9.sub9.numberOnly() < $B0)'
passing 42 as "B0" type(strict))
;
PLAN_TABLE_OUTPUT
__________________________________________________________________________________________________
SQL_ID 7yuwrup6rscrc, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ "DATA" from "ORA"."nest" where
JSON_EXISTS("DATA",'$?(@.field9.sub9.sub9.numberOnly() < $B0)'
passing 42 as "B0" type(strict))
Plan hash value: 3225864993
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:01.04 | 100K| 1143 |
|* 1 | TABLE ACCESS FULL| nest | 1 | 1000 | 0 |00:00:01.04 | 100K| 1143 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_EXISTS2( /*+ QJSNMD_USE_SHR_XCTX */ "DATA" /*+ LOB_BY_VALUE */
FORMAT OSON , '$?(@.field9.sub9.sub9.numberOnly() < $B0)' /* json_path_str
$?(@.field9.sub9.sub9.numberOnly() < $B0) */ PASSING 42 AS "B0" FALSE ON ERROR
TYPE(STRICT) )=1)
Statistics
-----------------------------------------------------------
40 CPU used by this session
40 CPU used when call started
106 DB time
404934 RM usage by this session
4 Requests to/from client
1125 Session total flash IO requests
9363456 cell physical IO interconnect bytes
100221 consistent gets
100221 consistent gets from cache
100221 consistent gets pin
99096 consistent gets pin (fastpath)
1142 gcs data block access records
158 global enqueue gets sync
158 global enqueue releases
1134 non-idle wait count
71 non-idle wait time
2 opened cursors cumulative
1 opened cursors current
1125 physical read total IO requests
9363456 physical read total bytes
61 process last non-idle time
100221 session logical reads
71 user I/O wait time
4 user calls
The most telling number is not the execution time but the Buffers column: 301,074 consistent gets for the flat table versus 100,221 for the nested table. Oracle is reading three times as many data blocks for the flat documents. To understand why, I check the actual segment sizes:
with
table_seg (owner,table_name, table_blocks) as ( select owner, segment_name, sum(blocks) from dba_segments group by owner,segment_name),
lob_seg (owner,table_name, lob_blocks) as ( select owner, table_name, sum(blocks) from dba_lobs join dba_segments using (owner, segment_name) group by owner,table_name),
tab_stats (owner,table_name, avg_row_len) as ( select owner, table_name, avg_row_len from dba_tables)
select owner, table_name, table_blocks, nvl(lob_blocks, 0) as lob_blocks, table_blocks + nvl(lob_blocks, 0) as total_blocks, avg_row_len
from tab_stats natural join table_seg natural left join lob_seg
where owner = 'ORA' and table_name in ('flat', 'nest')
;
OWNER TABLE_NAME TABLE_BLOCKS LOB_BLOCKS TOTAL_BLOCKS AVG_ROW_LEN
________ _____________ _______________ _____________ _______________ ______________
ORA flat 1,152 317,608 318,760 166
ORA nest 100,992 32 101,024 7,724
The segment query reveals the full story. The nested documents have an average row length of 7,724 bytes and are stored entirely inline in the table segment's blocks, with almost no LOB blocks. The flat documents have an average row length of only 166 bytes in the table segment, but they spill into 317,608 LOB blocks. That tiny average row length for flat is not a sign of small documents — it is the sign of a pointer because large documents do not fit in fixed-sized table blocks. The actual OSON bytes have been pushed out of the row into a separate LOB segment managed by Oracle's securefile infrastructure, and every document access requires an additional pointer dereference to fetch them from there.
The root cause is the OSON dictionary. OSON stores each distinct field name only once in a per-document dictionary and replaces every occurrence in the document body with a short numeric ID. The dictionary itself is the mechanism that compresses the field names. A flat document with 1,000 entirely unique field names like field000 through field999 requires a dictionary with 1,000 entries, one per distinct string, with no repetition to exploit. The resulting OSON document is large enough to exceed Oracle's block size and gets stored out of row as a LOB. A nested document whose entire structure uses only 20 distinct names — field0 through field9 and sub0 through sub9 — has a dictionary with just 20 short entries. Furthermore, objects that share the same field structure can reuse each other's field ID array entirely rather than repeating it. The resulting document is compact enough to be stored inline, directly in the table block, with no LOB indirection at all.
The consequence is dramatic here. Scanning the flat collection means fetching a pointer from the table block, then chasing it to a LOB block, for every single one of the 100,000 documents. Scanning the nested collection means reading the document directly from the table block. That is the difference between 301,074 consistent gets and 100,221.
So the two formats reach the same conclusion by different paths. BSON benefits from nesting because the length-prefix on sub-documents lets the parser skip entire branches with a single jump, reducing the number of field names it reads. OSON benefits from nesting because the dictionary compresses repeated field names, shrinking the documents enough to keep them inline in the table and avoiding the cost of LOB storage entirely.
JSONB and BSON on PostgreSQL
JSONB takes a different approach from BSON. Rather than storing fields in insertion order with length-prefix skipping, JSONB sorts keys at each nesting level and adds a header describing the offset and type of every value at that level, enabling binary search instead of sequential scanning. Because this structure is built independently per level, a nested document gets a short sorted list at each level rather than one long list at the top. The benefit of nesting is the same as with BSON, just through a different mechanism.
I imported the same data into PostgreSQL:
CREATE TABLE flat (data JSONB);
CREATE TABLE nest (data JSONB);
\copy flat (data) FROM PROGRAM 'mongoexport --collection flat'
\copy nest (data) FROM PROGRAM 'mongoexport --collection nest'
VACUUM ANALYZE flat, nest
;
The query on the flat structure takes more than two seconds:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF)
SELECT * FROM flat
WHERE (data->>'field999')::int < 42;
QUERY PLAN
----------------------------------------------------------------------------
Seq Scan on public.flat (actual time=2138.111..2138.111 rows=0.00 loops=1)
Output: data
Filter: (((flat.data ->> 'field999'::text))::integer < 42)
Rows Removed by Filter: 100000
Buffers: shared hit=400637
Planning Time: 0.043 ms
Execution Time: 2138.126 ms
(7 rows)
The query on the nested structure is twice as fast:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF)
SELECT * FROM nest
WHERE (data->'field9'->'sub9'->>'sub9')::int < 42;
QUERY PLAN
------------------------------------------------------------------------------------------------
Seq Scan on public.nest (actual time=1292.003..1292.003 rows=0.00 loops=1)
Output: data
Filter: (((((nest.data -> 'field9'::text) -> 'sub9'::text) ->> 'sub9'::text))::integer < 42)
Rows Removed by Filter: 100000
Buffers: shared hit=400637
Planning Time: 0.041 ms
Execution Time: 1292.017 ms
(7 rows)
2,138 ms versus 1,292 ms, same documents, same values, same worst-case field. The difference is purely the cost of navigating the JSONB structure. The nested document wins because each level presents the parser with a short sorted list of at most 10 keys rather than a single list of 1,000. It is interesting to see that both tables have the same values but different field names, due to nesting, finally have exactly the same number of pages (shared hit=400637) in JSONB/TOAST.
PostgreSQL can also store documents in BSON natively via the DocumentDB extension, which uses the same BSON library as MongoDB. I loaded the same data there to confirm the result holds.
DocumentDB is a PostgreSQL database, so I ran the JSONB queries above on the same instance (the PostgreSQL endpoint port is 9712 in DocumentDB docker container by default). Then I used the DocumentDB extension API to import the same data from JSONB tables to DocumentDB collections:
postgres=# \dx documentdb*
List of installed extensions
Name | Version | Schema | Description
-------------------------+---------+--------+-----------------------------------------------
documentdb | 0.111-0 | public | API surface for DocumentDB for PostgreSQL
documentdb_core | 0.111-0 | public | Core API surface for DocumentDB on PostgreSQL
documentdb_extended_rum | 0.111-0 | public | DocumentDB Extended RUM index access method
(3 rows)
-- Insert from the PostgreSQL JSONB tables into DocumentDB collections
SELECT documentdb_api.insert_one('db', 'flat', document)
FROM (SELECT data::text::documentdb_core.bson AS document FROM flat) t;
SELECT documentdb_api.insert_one('db', 'nest', document)
FROM (SELECT data::text::documentdb_core.bson AS document FROM nest) t;
I connected to the same DocumentDB database with the MongoDB-compatible API (the MongoDB endpoint port is 10260 in DocumentDB docker container by default).
The query on the flat structure with one thousand fields took 3 seconds:
db.flat.find(
{ "field999" : {$lt:42} }
).explain("executionStats").executionStats
;
{
nReturned: Long('0'),
executionTimeMillis: 3285.758,
executionStartAtTimeMillis: 3285.758,
totalDocsExamined: Long('100000'),
totalKeysExamined: Long('0'),
executionStages: {
stage: 'COLLSCAN',
nReturned: Long('0'),
executionTimeMillis: 3285.758,
executionStartAtTimeMillis: 3285.758,
totalDocsExamined: 100000,
totalKeysExamined: 0,
totalDocsRemovedByRuntimeFilter: 100000,
numBlocksFromCache: 374177,
numBlocksFromDisk: 76758
}
}
The query on the nested structure is twice as fast:
db.nest.find(
{ "field9.sub9.sub9" : {$lt:42} }
).explain("executionStats").executionStats
;
{
nReturned: Long('0'),
executionTimeMillis: 1401.24,
executionStartAtTimeMillis: 1401.24,
totalDocsExamined: Long('100000'),
totalKeysExamined: Long('0'),
executionStages: {
stage: 'COLLSCAN',
nReturned: Long('0'),
executionTimeMillis: 1401.24,
executionStartAtTimeMillis: 1401.24,
totalDocsExamined: 100000,
totalKeysExamined: 0,
totalDocsRemovedByRuntimeFilter: 100000,
numBlocksFromCache: 349453,
numBlocksFromDisk: 51482
}
}
3,286 ms versus 1,401 ms — consistent with the MongoDB result and explained by the same BSON length-prefix mechanism. DocumentDB stores documents in BSON on top of PostgreSQL storage, so the nesting optimization is identical to native MongoDB when looking at a field value.
Conclusion
| Format | Navigation mechanism | Why nesting helps |
|---|---|---|
| BSON | Sequential scan with length-prefix skip | Skips entire sub-documents in one jump |
| JSONB | Binary search on sorted keys per level | Keeps each sorted list short |
| OSON | Dictionary lookup | Reduces dictionary size |
The underlying mechanisms differ, but the guidance is the same: documents are nested structures that aggregate multiple entities and value objects. Don’t chase performance with documents containing 1,000 top-level fields — fix the data model first.
Model your documents the way you naturally think about the data — as a hierarchy of related objects — and the binary JSON format your database uses will reward you for it. Whether that is BSON in MongoDB, BSON via the DocumentDB extension for PostgreSQL, JSONB, or OSON, flat documents with hundreds of top-level fields are not just harder to read — they are measurably slower to query and heavier on storage. Nesting is not just good document design. It is how these formats are engineered to be used.
Top comments (9)
Franck, I appreciate the writeup, but it argues my case for me.
Every benchmark in it compares flat-vs-nested within one engine — Mongo flat vs Mongo nested, Oracle flat vs Oracle nested. The comparison that actually tests the claim is OSON-nested vs BSON-nested at the same shape, and you skipped it. I didn't. It's in the article you're responding to (methodology detailed below):
Nesting is your recommended fix, and in a head-to-head it makes BSON relatively worse, not better — the gap widens with depth, because every level is another potential full scan for BSON (the test above executes depth traversal on the first attribute, best case for BSON) and just another single jump for OSON. Your own example shows the same thing at micro scale: your 10x10x10 doc reads "at most 10+10+10 = 30 field names." On the identical document OSON does 3 dictionary-ID lookups and 3 offset jumps. 30 sequential string comparisons versus 3 integer lookups — on the structure you're telling people to adopt as the cure. You literally reproduced the O(n) vs O(1) result and relabeled it as a rebuttal.
That's the tell in the advice itself. "Nest your documents and BSON gets faster" is a workaround you only ever need for a format that pays per field. Nobody gives that advice to OSON users, because OSON's access cost doesn't move with shape — flat or five deep, hash and jump. Oracle's VLDB 2020 paper calls it exactly that: tree pointers as jump-navigation offsets. Having to reshape your data to dodge the scan is the scan tax. Not having the tax is the architecture.
Two cleanups. Your "flat OSON is 3x the buffer reads" is a 1000-unique-name dictionary crossing the out-of-row threshold and spilling to a LOB — a storage boundary, not field-access complexity; a nested doc with 1000 distinct names spills the same way. And the 1000-field windmill you keep tilting at is the X-axis endpoint of a scaling curve — the textbook way to demonstrate complexity by varying input. The gap is already 28.6x at position 50, a field count you'd call realistic. Splitting 50 fields into 5 levels of 10 doesn't remove the scan; BSON does up to 10 string compares per level, OSON does 1 jump. The tax just gets redistributed.
BSON was a reasonable 2009 design from a team shipping a PaaS that pivoted into a database. OSON was designed in 2017 by the people who wrote the SQL/JSON standard, with eight years of BSON's limitations in front of them, and they chose jump-navigation over sequential scan on purpose — to kill exactly this cost. "Model it as a hierarchy and the format rewards you" is true. It's just more true for the format that doesn't make you do it to stay fast.
Methodology (DocBench, BsonVsOsonClientSideTest.testNestedFieldAccess): each object carries 10 sibling "padding" fields plus one structural field; depths 1/3/5; identical documents and identical path-navigation code for both formats — BSON via RawBsonDocument, OSON via OracleJsonValue, no driver-side HashMap shortcut. The "target" leaf sits at position 1 (best case for BSON); only the structural "nested" descent links sit deep (last position, after the 10 padding fields), so BSON scans ~10 siblings before each hop while OSON does one hash lookup. That is a modest, realistic fan-out, not a 1000-field strawman.
Worked out: ~14 / ~36 / ~58 BSON string comparisons at depth 1/3/5 vs 3 / 5 / 7
OSON jumps — which is the 5.4x / 11.9x / 12.6x in the table.
Source: github.com/rhoulihan/DocBench
If OSON is designed and optimized for thousands of top-level fields, that’s really cool for some workloads, like storing massive spreadsheets. But I still wouldn’t recommend that model for document applications. You can doesn’t mean you should. The same applies to relational tables: you can create 4,096 columns in an Oracle table, it works, but the engine is not primarily optimized for that many row pieces.
I agree that tests should be grounded on the same hardware. That is why I did all my tests using Oracle 26ai Free version running on the exact same hardware as MongoDB 8.1.
The whole point of your article is nest the data in the document and it gets faster. The fact that you did not explicitly say the words is irrelevant, it's what you are demonstrating.
OSON is optimized for accessing individual values within a document. Because of this it does not matter where the field is. Field 1 or 10000000000 the timing is the same, that is the point. BSON requires a linear sequential scan across fields until it finds the attribute name that matches an equality test. That is extremely heavy and expensive.
This is the difference between a properly engineered protocol and a hack produced during a product pivot. BSON was built to store and serve up full documents for a content management platform called 10gen. It was folded into a database under pressure to produce a product that the company could sell which became MongoDB. OSON was designed by a team of data scientists and engineers with decades of experience in database technology. They are not equivalent in any way and even the simplest tests expose this.
When I was at MongoDB I used BSON like a club to beat up RDBMS platforms like Postgres that claimed to support documents with bolt on technologies like JSONB while burning through CPU on JSON serialization as if its free. And yes I said bolt-on, that is exactly what JSONB is. You call it "plugin" but that is just another name for a feature that was not an integral part of the original design, e.g. bolt-on.
The fact is OSON flipped the script on MongoDB and that is why I am here at Oracle now. A proven database with a real CBO and 40 years of optimized execution planning that supports a true binary document protocol all of MongoDB's internal warts are suddenly exposed. The fact is there was a ton of technical debt that BSON protocol efficiency papered over. Unfortunately that paper is now peeling and the mold underneath is becoming visible.
Bolt-on? PostgreSQL was explicitly designed in 86 to store complex objects in addition to basic datatypes found in previous RDBMS. Objects, vectors, XML, JSON... those followed the trends but the idea is the same: the hooks for extensible datatypes, indexing, and storage were there explicitly to avoid bolt-on solutions later. You said "40 years of CBO"? 40 years ago, DB2 had great CBO, but Oracle was rule based. CBO started to be used in data warehouses 25 years ago, and later in OLTP (but forcing index cost lower). By the way neither 10gen or Oracle have invented binary serialization formats with variable length content, headers, and indexes. Devices exchange such formats for decades (Telco call data records or whatever).
You say tomato, I say tomahto.
Oracle has only 34 years of CBO history, so I guess that makes my entire argument incorrect. I must be a fool.
Tossing out Red Herring distractions in the comments does not make the article above any more correct.
BSON is inferior to OSON in every way that can be measured. This is fact.
Also, on the JSONB thing...while technically it is not a plugin it certainly has not been in the Postgres distribution since '86. It first appeared as part of the release in 9.4 which was released in '92. So I guess everything you say must also be wrong now that you stated an esoteric historical fact incorrectly as I did and now we are both just idiots talking at each other. (I of course do not believe that for one second...you are extremely sharp)
While JSONB is compiled natively into the Postgres engine today, the reality is that it was frankensteined out of a 10-year-old key-value extension called hstore. The development team openly admitted that after a year of trying to make nested data work in an extension format, they had to frantically pivot, rip out the code, and merge it into the core database engine under extreme fatigue to hit a release deadline. It is fully integrated, but it is built on old structural bones. That legacy foundation is exactly why it suffers from heavy write-amplification, lack of in-place updates, and terrible query planner selectivity estimates today.
Compiled in or not, JSONB is a bolt-on by any reasonable definition.
I didn’t say “the JSONB thing” — I was referring to the extensibility framework that made JSONB (and many other features) possible in the first place. The key idea is that these capabilities can be plugged directly into the storage layer and catalog, rather than bolted on top, and that the system supports custom data types and indexes at all layers. This was already described in 1986 in THE DESIGN OF POSTGRES as “access methods” (with the first design principle being to “provide better support for complex objects”). The closest thing in Oracle Database is Data Cartridge (around 8i).
This is why I will always enjoy our banter, Franck. Accuracy is what its all about.
We may often disagree, but we will always agree on that even when we have different opinions on what accuracy is.
In this case my assumption as the reader and yours as the author were inconsistent which generated hallucinations. :)
Nested vs flat documents — I feel this in CSV/JSONL logging for chat+call logs. How do you version schema changes without breaking replay?