Forum

 
ForumForumDiscussionsDiscussionsSQLBI Methodolo...SQLBI Methodolo...What belongs in the veiw, and what belongs in the ETL?What belongs in the veiw, and what belongs in the ETL?
Previous Previous
 
Next Next
New Post
 7/23/2009 5:36 PM
 

 I'm interested in hearing your opinions on this.  We have different source systems based on Region that we pull into our staging area.  Once in staging, we have the complete set of data from all regions to create a consolidated "Customer" dimension.  We have two choices:

1) Create a view that "Unions" all of the separate region data for customer and use this view in our SSIS package to load a Customer Dimension, or

2) Create separate views for each regional Customer table and do the Merge/UNION and all other mapping, conforming, etc. within our SSIS packages, or

3) Do not create any views at all, but select directly from the tables within the SSIS ETL process (similar to #2, but using tables instead of views)

What are your opinions on this matter?  Should views always be a one-to-one relationship with their tables, or should they be used to combine data from several sources for use within the ETL process?


Thanks!

(BTW, I just ordered your new book "Expert Cube Development", and look forward to reading it!)

-Kory

New Post
 7/24/2009 10:30 AM
 

It is always hard, when introducing an indirection level, to decide if the business logic should be put inside it or not. So, your question is a really interesting one.

From the performance point of view, there are clearly no differences between the three solution, so the decision should be take looking at the architectural point of view.

I like most the second solution, since the separation between different regions should be clear to the SSIS developer. If we hide the UNION inside a view in the SQL database, then we will end up developing the SSIS package without having the knowledge that different sources are available. If, for example, you need to perform a slightly different processing the the EUROPEAN customers, then the best place where to make it is in the SSIS package. In order to do that, you need to have a separate flow for EUROPEAN in the SSIS package.

If you UNION all inside the SQL VIEW, then that processing need to be made inside the view and this is definitely not the right place where to carry on ETL steps.

Of course, this is my personal opinion, based on the few information I have about the system you are building. :)

Alberto

P.S. Hope you will enjoy reading it, let us know what you think about the book after having read it. :)

New Post
 7/24/2009 10:36 AM
 

I agree with Alberto.

I would not choose #3 (direct table) because of the lack of a decouple layer. And I would not choose #1 (UNION in view) because it could be a performance issue if the data are coming from different databases (in case you don't use a single staging db) and if you will use that VIEW in some join (for example, with another view) than the resulting performance may be affected in a bad way.

Marco

PS: And thank you for ordering our book!

New Post
 7/24/2009 5:08 PM
 
 Modified By kskistad  on 7/24/2009 5:08:54 PM

Thanks for the replies, I found them very helpful.  

As a side note, I recently discovered some helpful information_schema views, VIEW_TABLE_USAGE and VIEW_COLUMN_USAGE that shows which tables/columns are being used by views (they even show udf's if you are using them).  So if I make it a habit of using views as the layer of abstraction for all of my ETL, I can easily do impact analysis should the source-systems change.  I agree with your paper where you state you should only bring in the subset of tables/columns into your mirror database so it is self-describing on which entities you are using in your process, but in my case, our data warehousing team is simply replicating all the tables/columns into our staging area, so using views and the information_schema views help me to determine which tables/columns my process depends on.

Thanks again.

-Kory

 

Previous Previous
 
Next Next
ForumForumDiscussionsDiscussionsSQLBI Methodolo...SQLBI Methodolo...What belongs in the veiw, and what belongs in the ETL?What belongs in the veiw, and what belongs in the ETL?