Understanding NoSQL → SQL
Documents into rows — the migration nobody wants twice.
How embedded arrays become tables, the queries that break, and the lift-and-shift shortcut via JSONB.
Why the migration happens.
A team picked MongoDB for the early prototype because document storage matched the data shape. Three years later, the queries got complex, the joins got expensive in application code, and reporting needs caught up. Moving to PostgreSQL recovers transactions, joins, and analytic queries. The reverse migration is rare; the forward one (NoSQL → SQL) is common enough that it deserves a playbook.
Normalise the embeds.
A document with an array of comments turns into two tables: posts and comments, with a post_id FK on comments. A document with embedded address fields stays in the same row (an address has one-to-one with a user). A document with an embedded array of tags becomes a third table — tags + post_tags junction. The rule: one-to-one stays inline, one-to-many or many-to-many becomes a join.
Lift and shift via JSONB.
PostgreSQL's JSONB column lets you migrate without restructuring. Take the document, stuff it into a single column, add B-tree indexes on the fields you query often. The result has roughly Mongo-equivalent semantics on the read side. Use it as a stop-gap or for documents that don't really need to be normalised; don't use it for everything or you give up most of why you moved.
The queries that change.
MongoDB's find({ "comments.author": "alice" }) on an embedded array becomes a SQL SELECT posts.* FROM posts JOIN comments ON comments.post_id = posts.id WHERE comments.author = 'alice'. The query is more verbose; the planner can use indexes; the result includes duplicates if a post has multiple matching comments (SELECT DISTINCT or GROUP BY). Mongo's $lookup is the closest equivalent to a SQL join; queries that relied on it migrate naturally.
A worked translation.
Mongo: { _id: "p1", title: "Hello", tags: ["a","b"], comments: [{ author: "alice", text: "Hi" }] }. Postgres: posts(id, title), post_tags(post_id, tag), comments(post_id, author, text). Migration script: iterate Mongo collection, insert into posts, fan out tags and comments. Index on comments.post_id, post_tags.post_id, post_tags.tag. Reads are ~3 joins; the planner handles it.
Post document → 3 tables
embed → join
Normalise arrays, index FKs.
1 doc → 1 post + N tags + N comments
= 3 tables, 3 indexes
What you give up, what you gain.
Lose: schema-on-read (you have to migrate now), horizontal sharding ergonomics (Postgres scales vertically before Citus), single-document atomic upserts. Gain: ACID transactions across rows, foreign-key integrity, the SQL ecosystem (BI tools, reporting, ORMs, dbt), the query planner. For typical OLTP workloads under a few TB, the gains usually win; past that, the calculus shifts.