Combine Data from Multiple Databases Into 1 Table
Question: if I have different customers, each with their own DBs with similar structure, how do I combine/append them into a single data-warehouse table?
- C1-T1
- C2-T1
- C3-T1
- All goes into DW-T1, with additional
customer_id
column
Approach: multiple data imports, and a transform/view
We suggest to create multiple data imports to load data from different customers tables into different tables, like:
- C1-T1 go to
dw.c1_t1
- C2-T1 go to
dw.c2_t1
- C3-T1 go to
dw.c3_t1
And create a data transform that governs these customer's data:
-- data transform that writes to dw.master_t1 table
select *, 'c1' as customer_id
from dw.c1_t1
union all
select *, 'c2' as customer_id
from dw.c2_t1
union all
select *, 'c3' as customer_id
from dw.c3_t1
You can also convert this data transform into a incremental transform, so that each time it'll only capture the latest data for each customer table.
This approach is modular, reducing the chance of data inconsistency, and is also easy to modify at every step along the way.
So the total steps needed are:
- Create N data imports for N customers tables
- Create 1 data transform (query above) to group N tables into 1 master table