
Author Incremental Load to DataWarehouse table
marcmc

2005-04-06, 8:01 pm

I have never done it before.

By the term incremental, I mean that I have a fact table and each day I want
to add records to it that do not already exist in the table. I do not want to
truncate and do a full load each day. There are millions of rows. I am
already thinking about Last_RunDate_ID fields etc

What is the best approach to thinking about/coding this?
Any methodologies out there?

Jéjé

2005-04-10, 8:23 pm

There is no single solution;
you have to choose the one that fits your data.

For example, if you only ever add new rows that follow the last one already
in your table, a simple SELECT MAX(DateKey) FROM MyTable tells you where to
resume. This works well for log-like data or sales transactions:
each transaction has a new date, nothing to delete, only new rows to add.
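
A minimal sketch of the MAX(DateKey) approach, using Python's built-in sqlite3 module so it runs anywhere. MyTable and DateKey come from the post; the SourceSales staging table, the Amount column, and the load_new_rows function are hypothetical names chosen for illustration. It assumes each DateKey is fully loaded in one run, so rows arriving late with an already-loaded DateKey would be missed.

```python
import sqlite3

def load_new_rows(conn):
    """Append source rows whose DateKey is newer than anything already loaded."""
    cur = conn.cursor()
    # High-water mark: newest DateKey in the fact table (None if it is empty).
    (last_key,) = cur.execute("SELECT MAX(DateKey) FROM MyTable").fetchone()
    if last_key is None:
        # First run: nothing loaded yet, so take everything.
        cur.execute("INSERT INTO MyTable SELECT DateKey, Amount FROM SourceSales")
    else:
        cur.execute(
            "INSERT INTO MyTable "
            "SELECT DateKey, Amount FROM SourceSales WHERE DateKey > ?",
            (last_key,),
        )
    inserted = cur.rowcount  # rows added by the INSERT above
    conn.commit()
    return inserted

# Hypothetical demo data: SourceSales stands in for the daily extract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SourceSales (DateKey INTEGER, Amount REAL);
CREATE TABLE MyTable (DateKey INTEGER, Amount REAL);
INSERT INTO SourceSales VALUES (20050401, 10.0), (20050402, 12.5);
""")
first = load_new_rows(conn)    # initial load
conn.execute("INSERT INTO SourceSales VALUES (20050403, 7.0)")
second = load_new_rows(conn)   # next day's run picks up only the new row
```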

If you know that you sometimes have to reload part of your data, you can cut
your big table into smaller ones (partitions) and truncate/reload only the
partitions that need it.
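
A rough sketch of the partition idea, again in Python with sqlite3. It assumes one table per month (the Sales_YYYYMM naming, the Amount column, and both function names are made up for this example), and it uses DELETE because SQLite has no TRUNCATE TABLE; on SQL Server the member tables would typically sit behind a UNION ALL view.

```python
import sqlite3

def partition_name(month):
    """Return the table name for a YYYYMM month, e.g. 200504 -> Sales_200504."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not (isinstance(month, int)
            and 1900 <= month // 100 <= 9999
            and 1 <= month % 100 <= 12):
        raise ValueError(f"not a YYYYMM month: {month!r}")
    return f"Sales_{month}"

def reload_partition(conn, month, rows):
    """Empty one monthly table and load it again; other months are untouched."""
    table = partition_name(month)
    with conn:  # commits on success; a failed reload rolls back the delete and inserts
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} (DateKey INTEGER, Amount REAL)"
        )
        conn.execute(f"DELETE FROM {table}")  # stand-in for TRUNCATE TABLE
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
reload_partition(conn, 200504, [(20050401, 10.0), (20050402, 12.5)])
reload_partition(conn, 200505, [(20050501, 7.0)])
# April turned out to be wrong: reload only that partition with corrected data.
reload_partition(conn, 200504, [(20050401, 11.0)])
```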




Peter Nolan

2005-05-16, 1:23 pm

Hi marcmc,

I published COBOL and C code that does what you are asking for, and much
more.

I published it because MSFT are dropping embedded SQL in C in 2005, so the
code cannot be taken forward, but all the ideas in it work. I have
implemented them in many places.

http://www.peternolan.com/Default.aspx?tabid=57

You can always take what is there and rewrite it in another language.

If what you really need is a way to get the data out of a source system
incrementally (which is a different problem), and the source system does not
provide incremental extracts, my company provides free utilities on win2000
(plus source code). One of them is a 'delta generation utility' that compares
two files and generates the deltas. You can read about it at
http://www.instantbi.com/Default.aspx?tabid=30 under IDW utilities. This way
you can detect changes to files and pass only the changes through to your ETL
subsystem. Another utility detects whether a row already exists in the target
table and deletes it, so that when the row is loaded the loader does not
crash with a constraint violation.
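
To make the delta idea concrete, here is a small Python sketch that compares two extracts of the same file. This is not the utility mentioned above, only an illustration of the general technique; the generate_deltas name, the CSV layout, and the assumption that the first column is a unique key are all hypothetical.

```python
import csv

def generate_deltas(old_lines, new_lines, key_index=0):
    """Compare two CSV extracts and return (inserts, updates, deletes)."""
    # Index each extract by its key column (assumed unique within a file).
    old = {row[key_index]: row for row in csv.reader(old_lines)}
    new = {row[key_index]: row for row in csv.reader(new_lines)}
    inserts = [new[k] for k in new if k not in old]                    # new keys
    updates = [new[k] for k in new if k in old and new[k] != old[k]]   # changed rows
    deletes = [old[k] for k in old if k not in new]                    # vanished keys
    return inserts, updates, deletes

# Hypothetical extracts: yesterday's and today's full dumps of the same source.
yesterday = ["1,widget,10", "2,gadget,20", "3,gizmo,30"]
today = ["1,widget,10", "2,gadget,25", "4,doohickey,40"]

inserts, updates, deletes = generate_deltas(yesterday, today)
```

Only the rows in inserts, updates, and deletes would then be passed to the ETL step. Holding both files in dictionaries is fine for modest extracts; for very large files a sort-and-merge comparison would use far less memory.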

Best Regards

Peter Nolan
www.peternolan.com




