﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Data Warehousing / Strategies and Ideas </title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Tue, 18 Jun 2013 20:42:43 GMT</lastBuildDate><ttl>20</ttl><item><title>Single DW multiple sources</title><link>http://www.sqlservercentral.com/Forums/Topic1453895-363-1.aspx</link><description>Hey all,We currently have multiple systems that we report on.  They all contain information on the same topic - but they are systems built externally by different companies.However one major goal is to be able to report on these systems together.Here is my general plan within SSIS1) Create denormalised structure for each system.2) Delete stuff for that system in unversal tables, and copy over the tables.Now there are a couple of things that have sprung to mind.Firstly i am doing step one to minimise the load on the central tables.  So it should hopefully avoid much locking if multiple SSIS packages (one for each system) are running in parrallel.  Is this the best way to handle this?Secondly - keys..... some are ints - others are varchar.  Hows the best way to split the primary keys?  I am planning to have a source column to identify the system.  So am i better off having a composite key for every table - or the other option i can think of is to append a letter to signify the system to each and every PK and FK.I would appreciate any thoughts.Dan</description><pubDate>Fri, 17 May 2013 03:03:15 GMT</pubDate><dc:creator>danielfountain</dc:creator></item><item><title>Help needed for creating the star schema</title><link>http://www.sqlservercentral.com/Forums/Topic1460093-363-1.aspx</link><description>Hi,We are planning to data warehouse the 2 different projects , to implement this we have to create the star schema from the 2 project database. Can anyone help us which approach is best to achieve this goal?Regards.Ram</description><pubDate>Wed, 05 Jun 2013 02:55:28 GMT</pubDate><dc:creator>sram24_mca</dc:creator></item><item><title>Good analysing, reporting end-user tool ??</title><link>http://www.sqlservercentral.com/Forums/Topic344752-363-1.aspx</link><description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm looking for a good analysing program for our end-users.&lt;/P&gt;&lt;P&gt;It needs to be able to :&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;build queries with a simple GUI(see MS Access)&lt;/LI&gt;&lt;LI&gt;build pivot reports&lt;/LI&gt;&lt;LI&gt;build regular reports&lt;/LI&gt;&lt;LI&gt;save the reports&lt;/LI&gt;&lt;LI&gt;save qeuries as templates&lt;/LI&gt;&lt;LI&gt;export the results to other formats(excel)&lt;/LI&gt;&lt;LI&gt;pass parameters with a simple GUI&lt;/LI&gt;&lt;LI&gt;easily filter the results (see excel data filter)&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;So it should be more or less the combination of MS Access and MS Excel, but only in read mode to the database.&lt;/P&gt;&lt;P&gt;So any one using such a tool please let me know.&lt;IMG height=20 src="http://www.sqlservercentral.com/forums/images/emotions/blush.gif" width=20&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Serge&lt;/P&gt;</description><pubDate>Wed, 14 Feb 2007 02:21:00 GMT</pubDate><dc:creator>Serge Josse</dc:creator></item><item><title>Log ship to local db for ETL source data</title><link>http://www.sqlservercentral.com/Forums/Topic1461148-363-1.aspx</link><description>I'm using log shipping to maintain a failover copy of my main db on a secondary server. I'd like to put our Analysis Services on the secondary. If we have to  failover to the secondary then Analysis Services would be turned off until we got things back in normal operation.It occurs to me that, during normal operation, I could run the nightly ETL jobs against the copy of the db on the secondary, putting everything local on that machine. It should improve performance, reduce execution time and relieve load on the primary instance, yes?Am I missing something? Any comments on this idea?</description><pubDate>Fri, 07 Jun 2013 10:02:10 GMT</pubDate><dc:creator>rray 44280</dc:creator></item><item><title>Retrieve more than 1 million rows from Cube</title><link>http://www.sqlservercentral.com/Forums/Topic1458480-363-1.aspx</link><description>Hello All,Am new to ssas and been trying to implement it for our current systemHere is what i have builtFact table: 70 million rows(potentially grows to 4 to 5 Billion)10 Dimensions(1 dimension having 5 million rows and others vary from 1-10000 rows)I have processed the cube and my results sets involving less than 10000 rows are amazingly fast.(With the huge dimension included in the MDX query)What am trying to Do: - I need to retrieve data based on this huge dimension and the result set will have close to 1 million rows. This is taking me close to 1 minute to retrieve this kind of dataMy question is: - Can the performance of the cube to retrieve data(in millions) be optimized? like less than 5 seconds - Or Is it a wrong approach by using a cube to pull this kind of data?Note:-I have also used the aggregation wizard ans UBO that brought my query time from 65 seconds to 54 seconds:-)-The MDX queries am running are from script view of the Cube Browser and other MDX designers i found onlineAny Ideas would be of great help</description><pubDate>Thu, 30 May 2013 15:55:12 GMT</pubDate><dc:creator>jondy87</dc:creator></item><item><title>DM Modeling Suggestion</title><link>http://www.sqlservercentral.com/Forums/Topic1457976-363-1.aspx</link><description>Hi everyone,I have some question regarding DM design. I am currently working on a clinical DM. Ultimate goal is to build a SSAS cube and run report from our clinical DM.Business Scenario is fairly simple: Patient can have multiple hospital visits, and for each visit, patient could have multiple transactions for lab, medicine and procedures.So far these are the dimension we have:- Patient- Procedure- Lab- Medicine- Procedure- Geography- Flag Field Junk DimensionWe currently have one fact table. Its granularity is at transaction level. Our per visit level information is also in the fact table as repeating column value.FK_Paitient / FK_Lab / FK_Medicine / FK_Procedure / 6 Lab measure attributes / 5 Medicine measure attributes (5) / 3 Procedure measure attributes (3) / 8 Visit Level AttributesEach fact line contains lab result/applied medicines/performed procedures during patient visit to our hospital. So, if patient has 7 transactions during visit, we will have 7 lines in fact tables. Since all transaction's attributes are merged, not attribute columns have some values. Some will have null values depending on the data sources (if data source is from medicine, only the medicine related attribute columns will have data and other will be filled in with null value, if data source is from lab then lab attributes will be filled in and other attributes will be filled in with null).However, since all transaction gets generated based on patient visits, all visit level attributes information are all populated and repeated 7 times. We were trying to stay in star schema, and we thought if we were to write a report from the current design DM, we would set the data scope based on patient visiting date, and show all tied transactions.Since this was our first attempt, we realized our design is flawed, and this is where we are stuck and not sure what would be best practice or solution...1. Is it okay to break fact table into 4? Instead of one giant fact table, create fact table for visit level, lab transactions, medicine transactions, procedure transactions. Each our data scope is determined based on patient visit date, so fact tables will have visit_date_key and some minimal visit level key values.2. In our analysis scenario, all of analysis is done on visit level. Lab/medicine/procedure transactions are required for report to show some detailed information. (to show MDs what lab tests are done and results for lab tests etc.) We are bring these data for drill through/down purpose only. When we roll up these values, it really means nothing. We can only derive factless(?) fact type of measure from these transaction data (i.e. total number of medicine administered during patient visit). In this situation, is it better to create visit dimension rather than visit fact, and try to derive measure from visit dimension and combine three transaction (medicine, lab, procedure) into one fact table?3. If we break the fact tables into 4. What would be the proper way to link visit fact to transaction fact tables? It's 1-M from visit to lab/medicine/procedures. Only common dimension for these 4 fact tables would be patient, and visit fact table will act like a dimension...Any suggestion would be greatly appreciated!</description><pubDate>Wed, 29 May 2013 15:45:38 GMT</pubDate><dc:creator>BlackGarlic</dc:creator></item><item><title>Customer Journey</title><link>http://www.sqlservercentral.com/Forums/Topic1455410-363-1.aspx</link><description>Hello, Has anybody had any experience with representing the customer journey (funnel/loyalty levels) as part of a dimensional model? Would love to hear from experience!Thanks.</description><pubDate>Wed, 22 May 2013 04:22:06 GMT</pubDate><dc:creator>ofridar</dc:creator></item><item><title>Dimension Tables vs Lookup Tables - the real difference?</title><link>http://www.sqlservercentral.com/Forums/Topic1456624-363-1.aspx</link><description>I'm reading about dimension and lookup tables and trying to get a better understanding between the difference. So far the best descriptor i have to separate these two tables is below. I would like to see if perhaps i'm missing something or if i'm understanding this correctly? "For each dimension used in a project, a lookup table is required to provide meaningful information. For example, a "product" dimension might be based on a lookup table that includes the productID, productName, productType, productManufacturer, and so on. Dimensions define a hierarchy for the information retrieved by the lookup table, so that it can be organized and presented in a meaningful way."Example:Dimension tableVW -&amp;gt; Golf -&amp;gt; Features VW -&amp;gt; Jetta -&amp;gt; Features Honda -&amp;gt; Accord -&amp;gt; FeaturesLookup tableFeatures 1 -&amp;gt; xxx2 -&amp;gt; yyy3 -&amp;gt; zzzmanufacturervw -&amp;gt;aaahonda -&amp;gt;bbbcargolf -&amp;gt;cccjetta -&amp;gt;dddaccord -&amp;gt;eeeetc</description><pubDate>Fri, 24 May 2013 11:01:10 GMT</pubDate><dc:creator>cs_source</dc:creator></item><item><title>SSIS package for DWH Load</title><link>http://www.sqlservercentral.com/Forums/Topic1418989-363-1.aspx</link><description>Hi All,Please suggest the best option for designing SSIS package for DWH load.Here is the scenario.I have two star schema model with total 8 Dimension and 4 fact tables.I thought of creating three packages.First package will select data from systems(there are five different src systems) to staging with required transformations.Second Package  will load data to dimension tables.Third Package will load data to fact tables.Please suggest this is the best design to go with or is there any better design option you can suggest.ThanksSam</description><pubDate>Tue, 12 Feb 2013 08:05:04 GMT</pubDate><dc:creator>sam 55243</dc:creator></item><item><title>High level architecture  -- consolidating data from multiple servers</title><link>http://www.sqlservercentral.com/Forums/Topic1438939-363-1.aspx</link><description>I'm looking for some input regarding how to architect a process for ongoing consolidation of data from multiple servers.Here's the background.We have (currently) 3 separate, but structurally identical, MMSQL database servers (Standard edition) which handle a moderate amount of data (about a million rows per day each). Each server runs about 12 separate databases (again, identical setup across all 3) with about 320 tables that service the various functions our company provides to our clients. There were a variety of (mostly good) reasons for splitting up our customer base across 3 separate groups, mostly having to do with data isolation as well as simply not putting all of our eggs in one basket. It is imperative that each of these three servers be VERY responsive to transactional queries made by our various applications and services. We want to keep the load on each of those servers very light.The need at this time is to begin building out a data warehouse that will allow us to do analytics across all of our clients across all three of these groups. The first step we are taking is to, quite simply, aggregate the data from all three groups into a single, new set of the same 12 databases that will then serve as a set of staging databases from which to build out the warehouse. We are NOT envisioning any transformation or data cleansing at this point -- just aggregation.Nearly all of the tables in all of the databases use GUIDs as primary keys; exceptions are lookup tables that are identical across all three groups anyway. So we don't have to be concerned about surrogate keys at this point. The goal is to simply mix the data from all three groups into a single, large SQL instance. All of the tables have CreatedDate and UpdatedDate columns, but the values in those columns are actually set by .NET code in multiple application servers (not the database server) and there is no guarantee that the system clocks are precisely synchronized. Therefore, I'm thinking of using Change Data Tracking (NOT Change Data Capture, which is only available in the Enterprise version) to look for Insert/Update/Delete notification.A side benefit of this aggregation will be to allow us to regularly "prune" data out of the production SQL instances (data over a year old, for example) without actually losing it (because it will be retained indefinitely in this larger, aggregated instance). Therefore, deletion of data in the 3 production instances (which would only happen when pruning, since our processes only perform a logical delete, not a physical delete, of data) should NOT cause data to be deleted in the aggregated instance.We envision keeping the aggregated, staging server fairly current -- latency of not more than 12 hours in the data. So we're looking to run some process on a 12 hour (or shorter) schedule.If you're still reading, thank you!Here's what I've thought of so far:1) SSIS -- I've worked out a design for an SSIS package that could run against all three servers. However, since each table is different, I'm looking at 320 separate data flows that would have to be created using the SSIS GUI/designer, with multiple steps in each. Not sure I have to patience/guts for that. Plus, maintenance could be a nightmare.2) Replication -- We're already using transactional replication within each of these three 'groups' to keep a copy of the production data available on a second server as a 'warm' failover as well as for reporting. I've looked into simply creating a subscription on the aggregating server to each of the three existing publications. I understand this is possible, but there appear to be a great many issues surrounding initializing the subscriptions as well has preventing deletes (pruning) on the production side to NOT cause a delete on the aggregation server. In other words, a potential minefield?3) SQL queries/stored procedures -- I have created a template for using Change Data Tracking in a stored proc to extract changed data. I've considered using a code generation tool to take that template and programmatically create the 320+ stored procs that would be put on each of the 3 production servers and then executed by a 'master' process on the aggregating server (via a linked server connection). This seems the most promising at this point, especially since change to database schema that could occur down the road could be handled by re-generating the relevant SP's and deploying along with the schema changes.Again, thanks for reading this far. Anyone have any feedback on any of the three architectures mentioned above, or have a completely new, better path to send me down?Thanks for your input!</description><pubDate>Thu, 04 Apr 2013 11:42:38 GMT</pubDate><dc:creator>Rob Schripsema</dc:creator></item><item><title>creating an investment data warehouse</title><link>http://www.sqlservercentral.com/Forums/Topic1452294-363-1.aspx</link><description>Looking for some help designing a small prototype for investments.  Ideally, what I'm trying to show is a position of a portfolio as it changes over time.  A position is simply what value is attached to a portfolio at a given point in time, and a portfolio is the securities (stocks, bonds, etc) that are included in the measurement.So if I have 5 shares of IBM that are worth 100$, 10 shares of Microsoft worth 200$, my position right now is $300.I'm wondering how to track this.  Do I have a fact table to track each individual investment, with a DimInvestment to tell me more about it, and a DimDate to allow me to slice and dice by date/time.  If I did this, how would I track the positions?  I was thinking I would have a DimPosition table that receives an entry each time the position changes as well as a new set of rows for my Fact table.Am I headed in the right direction?</description><pubDate>Mon, 13 May 2013 13:33:10 GMT</pubDate><dc:creator>Matthew Cushing</dc:creator></item><item><title>Strategies for dealing with Excel</title><link>http://www.sqlservercentral.com/Forums/Topic1450421-363-1.aspx</link><description>Like a lot of organisations, we struggle with the fact that a lot of users manage important business data in excel spreadsheets.  Wherever possible, I've been encouraging them to move primary data out of excel and into our corporate systems... leaving excel for just analysis.  Once the data is being managed in a corporate system, I can then offer them automated reports... which generally makes everybody happy :)However, there are plenty of examples that I come across where existing corporate systems simply don't have the capacity or capability of storing the data we've been capturing in excel.  Nor is there any real cost-benefit in implementing a new system.For example... we have a small team that handles customer complaints.  They currently manage complaints data in an excel spreadsheet, and asked if my team could help automate their reporting.  Most of our customer data is already stored in our ERP, but we don't have a CRM so there is nowhere to really store complaints information.So what do we do?* Continue to let them maintain their data in spreadsheets, but use SSIS to suck data into the warehouse on a regular basis* Build some kind of simple bespoke application using Access (ugh) or web forms, with a SQL Server backend* Use some kind of Sharepoint functionality (unfortunately, not currently implemented around here)* ?????</description><pubDate>Wed, 08 May 2013 00:40:46 GMT</pubDate><dc:creator>Tyberious Funk</dc:creator></item><item><title>Designing a Dim table for SSAS and SSRS</title><link>http://www.sqlservercentral.com/Forums/Topic1449814-363-1.aspx</link><description>New to SSAS.  I'm working on my first cube.  I'm creating a dim table for the accounts.  The level of detail that we would use for SSAS is the accounts county, but for SSRS we would want to be able to show the street address.  We want to be able to use the dim tables for reporting out of SSRS as well as the cube.  So if I add the street address to the Dim table, but I do not use it as an attribute in SSAS, is that considered a No-No in the SSAS world?  Is there a major negative impact on the Cube if I do include the data in the table?Thanks</description><pubDate>Mon, 06 May 2013 11:28:45 GMT</pubDate><dc:creator>adams.squared</dc:creator></item><item><title>Data Warehouse Models</title><link>http://www.sqlservercentral.com/Forums/Topic1442168-363-1.aspx</link><description>Someone from my company's strategy has said that the Data Warehouse model needs to be:- Atomic- 3rd NF- RelationalBut I disagree. I always considered DW models to be:- Dimensional- Data has to be very granular (represents 3NF atomic data)- Subject OrientedBut the Strategist from the team came back to me and commented:[quote]dimensional layer is for user access. What if user doesn't need to see the the data structures, eg if using a dashboard. Should we build a dimensional model? If so what advantages does it bring? Could just build a single layer? The atomic? The dimensional ? What are the implications if the dashboard includes data from both the DW and a OLTP application?[/quote]Can someone assist me with answering these questionsThanks</description><pubDate>Sun, 14 Apr 2013 23:31:06 GMT</pubDate><dc:creator>rka</dc:creator></item><item><title>Geography Dimension</title><link>http://www.sqlservercentral.com/Forums/Topic1074570-363-1.aspx</link><description>I am currently working on a warehouse project that includes a dimension for geography.  It includes Zip Code, City and State currently.  Latitude and Longitude are available as well in the feed that populates the dimension.  My question is should the lat/long be placed on the dimension or fact?  I am leaning towards the dimension but am getting pushback as others are saying it is too narrow for the dimension (will cause too many rows on the dimension) and it should be placed on the fact.Anyone else have this debate before?</description><pubDate>Mon, 07 Mar 2011 19:37:20 GMT</pubDate><dc:creator>Dave Klug</dc:creator></item><item><title>help to design a Cube that shows me all purchases for those who bought a given product</title><link>http://www.sqlservercentral.com/Forums/Topic1446367-363-1.aspx</link><description>Hi,I'm stuck, so stuck I'm not even sure what to google or put in the subject, so sorry for that.I've got a requirement which is simple in TSQL but can't figure out how to do it in SSAS, basically show me the purchase history for all contacts who have bought a specific product.Simplified I have a factTable of ContactID,ProductID,Quantity,Value  and dimensions of dimContact, dimProductMy train of thought is I need to filter dimContact based on values in factTable but not sure how.I'm sure I just need to be pointed in the right direction to get me out of my thought rut.Thanks and apologies if this is a dumb post.Giles</description><pubDate>Thu, 25 Apr 2013 03:26:16 GMT</pubDate><dc:creator>giles.clapham</dc:creator></item><item><title>Model Transaction Detail For SSAS</title><link>http://www.sqlservercentral.com/Forums/Topic1441031-363-1.aspx</link><description>[b]Dimension_Customer[u][/u][/b]	CustomerId	Name[b]Dimension_Date[u][/u][/b]	DateId	Date	Month[b]Fact_Sale[u][/u][/b]	CustomerId	DateId	TotalSaleLet’s assume the application that saved the sale accepts a comment from the user (free-form and nothing we want to aggregate by).  I would like to save this in the data warehouse so we can use Excel to show the comment for each sale.  Here is an example of how I would like to see the data in Excel.  If this isn’t possible without a dimension then what about “right click” show details?  What are the best options to model this?[u]January[/u]        TestUser1     $100.00                               *Drill down to comments*TestUser2  $200.00TestUser3  $300.00</description><pubDate>Wed, 10 Apr 2013 14:47:25 GMT</pubDate><dc:creator>SQL Dude-467553</dc:creator></item><item><title>Fact/Dimension design suggestions</title><link>http://www.sqlservercentral.com/Forums/Topic1404470-363-1.aspx</link><description>I'm from Higher Education and I'm working on a warehouse so our recruitment folks can compare application counts of certain statuses between today and a year ago today.  The data I'm working with from our ERP has a table that holds application records - there is a record for every application and what term they are starting in, among other things.  There is a related table that holds the status associated with that application along with the date that status was made active.  I need to display a nice, neat, easy to read graph showing the counts between the two dates but here's the kicker - for both dates I only want to display start terms that are/were being recruited for on that date.Any suggestions on building my DW database?</description><pubDate>Tue, 08 Jan 2013 15:32:36 GMT</pubDate><dc:creator>wshelton 51985</dc:creator></item><item><title>Data Warehouse Development Process</title><link>http://www.sqlservercentral.com/Forums/Topic1439553-363-1.aspx</link><description>Hi, I am trying to create a simple yet robust development lifecycle for my organization. I do not want to create unnecessary artifacts and steps.Has anyone created a simple yet effective lifecycle with the corresponding artifacts?I have thought of something like this:[code="plain"][b]Pre-requisites[/b]------------------------ ETL Framework &amp; Templates- ETL Standards- DW Standards (SQL, Oracle etc)- Source Code Control Standards[b]Requirements Gathering[/b]------------------------ Data Requirements- Sources of Data- Subject Area/Business Processes[b]Requirements Analysis and Design[/b]---------------------------------- Bus Matrix- Test Scenarios- Data Contract (Establish Interface requirement)- Business Rules- Transformation Rules- Solution Design[b]Data Modelling [/b]------------------------ Conceptual Model- Logical Model- Physical Model- Detail Data Design- Data Dictionary[b]ETL[/b]------------------------- ETL Specification- Source to Target Mapping[b]Development[/b]-------------------------- DW Objects- ETL Packages- Load Data (Execute Packages)- Unit Test- Code Walkthrough (Peer Reviews)- Scheduing (Enterprise Scheduling Tool)[b]Testing[/b]------------------------- UAT- Test Cases- User Guide[b]Implementation[/b]-------------------------- Release Register- Source Safe- Implementation Manual[/code]</description><pubDate>Sat, 06 Apr 2013 02:45:53 GMT</pubDate><dc:creator>rka</dc:creator></item><item><title>SQL Server 2012 VLDB Data Warehouse Design Question</title><link>http://www.sqlservercentral.com/Forums/Topic1397028-363-1.aspx</link><description>Appreciate feedback to this question.  I'm working on a design for VLDB that will be in the several TB size range.  The load activity will be vary infrequently as the data will be for research purposes only.  People will not have to make business critical decision with this data nor will there be any OLTP activity against the data.  In fact data queries will be mostly in the form of an export which also will not happen daily. Frequency of the export will most likely be along the lines of bi-monthly or quarterly.  That said, I'm having a hard time to justify the use of partitioning and even file-groups seem like a stretch in term of producing any real performance benefits.  The Cost/Benefit (C/B) breakdown doesn't appear to offer much of a compiling advantage.    Yes, the data repository will be large.  But I 'm thinking with the right performing storage, CPU and Memory, the export time-to-completion should be fairly reasonable without the use of these strategies.  Am I missing something in my thought process?  Given these requirements should I consider the use of partition or filegroups?</description><pubDate>Sun, 16 Dec 2012 16:49:45 GMT</pubDate><dc:creator>Anthony Stephens</dc:creator></item><item><title>Experiences using Wherescape Red on SQL Server?</title><link>http://www.sqlservercentral.com/Forums/Topic1424030-363-1.aspx</link><description>I am struggling to find much content out online so I was wondering if anyone wanted to share their experiences of using Wherescape Red on SQL Server?I would like to know what change management procedures you use to support the DW development, best practices you follow and how you guys generally find the product to work with on a day to day basis.</description><pubDate>Tue, 26 Feb 2013 05:47:34 GMT</pubDate><dc:creator>aaa-322853</dc:creator></item><item><title>sql12 tabular mode design</title><link>http://www.sqlservercentral.com/Forums/Topic1410852-363-1.aspx</link><description>Hello, Im finding it a little "extra" work is needed to get a complicated tabular model designed vs traditional OLAP cubes and it also seems more clunky to work with. For example renaming fact/dims tables/columns there is no auto-refresh of the data source view, you have to go to each table properties and refresh there and then modify your affected formulas. Having say 10 facts and 20 dims there are a lot of "inactive" relationships created by the designer which results in custom measures to be created for each of these, why did they do this? E.g. CountofInactiveRelationField:=CALCULATE(COUNT(Table[column]),USERELATIONSHIP(Table[column] ,LookupTable[LookupColumn])).  If we dont do this then the counts are not correct for this inactive field.  Traditionally to get a simple count rolled up for each dimension there is on standard count measure and the rollup happens automatically in OLAP cube. This happens to be a pain where there are 10+ formulas for each of the facts!Is this the design experience developers are having out there? or is there some workarounds for this? It seems memory/compression speeds for data loading/retrieve is offset by bad design principles needed to be applied to get it.</description><pubDate>Wed, 23 Jan 2013 17:43:21 GMT</pubDate><dc:creator>skeezwiz</dc:creator></item><item><title>Data flow mapping - ETL</title><link>http://www.sqlservercentral.com/Forums/Topic1423884-363-1.aspx</link><description>Hi,I was wondering what do you guys use to map the data flow in ETL. I am looking at the field level flow of data. By this, I mean a document that can be passed on to the ETL developers which maps source table.column to destination table.column and the transformation logic between source and destination. This is also handy on the reporting side to map out the fields and logic in report.  Currently we use a mix of excel and word to map this. I was wondering whether there is a better way/tool to do this? Thanks in advance :-)</description><pubDate>Mon, 25 Feb 2013 23:09:14 GMT</pubDate><dc:creator>chess-288252</dc:creator></item><item><title>Connecting to CRM 2011 Web Service using SSIS 2008</title><link>http://www.sqlservercentral.com/Forums/Topic1252777-363-1.aspx</link><description>I would like to backup MS Dynamics CRM 2011 Online using SSIS. My current code allows me to download known entities and known fields one record at a time. The issues I have to solve include..1. How do I obtain a list of all entities2. How do I obtain a list of all fields within each entity3. How can I download - lets say 500 records at a time.I do not want to use expensive ETL tools. I prefer to use SSIS.</description><pubDate>Wed, 15 Feb 2012 15:11:50 GMT</pubDate><dc:creator>pukkauk</dc:creator></item><item><title>How often do you back-up your data warehouse?</title><link>http://www.sqlservercentral.com/Forums/Topic1418618-363-1.aspx</link><description>Hi All,I'm trying to find a happy medium to a back-up plan for an up-and-coming data warehouse.  The DB won't be huge, as in TB's.  Max size is unknown but I don't see anything above 100 gig for quite some time.  I'll import the daily changes overnight since we work 7 days a week.My SLA on data availability (via SSRS) has not been nailed down but I think we'll offer M-F 8-5 support with no more than 1 business day for a full restore.With this said, could anyone offer suggestions on a back-up plan?  I was thinking once a week but of course I'm thinking a restore would be needed 6 days in to the current cycle and I'd lose a lot of data and have to manually import that data. Thanks,Mark</description><pubDate>Mon, 11 Feb 2013 13:12:56 GMT</pubDate><dc:creator>Mark Eckeard</dc:creator></item><item><title>Business datasets (overview, structure, naming conventions, business rules)</title><link>http://www.sqlservercentral.com/Forums/Topic1412313-363-1.aspx</link><description>Hi GuysI was wondering if someone could point me in the right direction.I need to gather information about different datasets used in different business areas: banking, insurances, online betting, sales, CRM ETC.So in summary: Is there anything that I could read to understand how other businesses (I’m currently in healthcare) do their business intelligence and analytics. How does the business works? How do their facts and dimensions look like? What kind of business rules do they apply?Do you know anything that I could read: books, articles, papers?It would be nice to get top level information first so I would prefer few hundred pages rather than few thousands for starters.Thanks for your help in advanceMike</description><pubDate>Mon, 28 Jan 2013 03:55:15 GMT</pubDate><dc:creator>michal.giczewski</dc:creator></item><item><title>Data modeling customer part numbers in data warehouse</title><link>http://www.sqlservercentral.com/Forums/Topic1394124-363-1.aspx</link><description>In our business system we have our customer part numbers stored in a table with its primary keys as: our part number, customer part number, customer location and customer number. I have a customer and a part number dimension already. Would I just create a dimension with customer and part number foreign keys in them or should I create a separate dimension that have some of the customer and part number information in that dimension and the attributes in the business system table. Some of the reports they are looking for are: What part numbers are loaded for a customer and Sales by customer part number. Has anyone modeled similar data before? How did you accomplish this?Thanks in advanced.</description><pubDate>Fri, 07 Dec 2012 09:27:34 GMT</pubDate><dc:creator>MikeBrey</dc:creator></item><item><title>Error in setting conditional split values??</title><link>http://www.sqlservercentral.com/Forums/Topic1400865-363-1.aspx</link><description>i set :order            : 1Output name  :  Case 1Condition       : [EmpID] !=0i got error :===================================Error at Data Flow Task [Conditional Split [30]]: The data types "DT_WSTR" and "DT_I4" are incompatible for binary operator "!=". The operand types could not be implicitly cast into compatible types for the operation. To perform this operation, one or both operands need to be explicitly cast with a cast operator.Error at Data Flow Task [Conditional Split [30]]: Attempt to set the result type of binary operation "EmpID != 0" failed with error code 0xC0047080.Error at Data Flow Task [Conditional Split [30]]: Computing the expression " [EmpID] !=0" failed with error code 0xC0047084. The expression may have errors, such as divide by zero, that cannot be detected at parse time, or there may be an out-of-memory error.Error at Data Flow Task [Conditional Split [30]]: The expression " [EmpID] !=0" on "output "Case 1" (79)" is not valid.Error at Data Flow Task [Conditional Split [30]]: Failed to set property "Expression" on "output "Case 1" (79)". (Microsoft Visual Studio)===================================Exception from HRESULT: 0xC0204006 (Microsoft.SqlServer.DTSPipelineWrap)------------------------------Program Location:   at Microsoft.SqlServer.Dts.Pipeline.Wrapper.CManagedComponentWrapperClass.SetOutputProperty(Int32 lOutputID, String PropertyName, Object vValue)   at Microsoft.DataTransformationServices.Design.DtsSplitterComponentUI.SetOutputAt(Int32 index, ColumnInfo outputInfo, String expression)   at Microsoft.DataTransformationServices.Design.DtsSplitterComponentUI.SaveExpressions(ColumnInfo[] outputs, String[] expressions, ColumnInfo elseBranchName)   at Microsoft.DataTransformationServices.Design.DtsSplitterFrameForm.SaveAll()</description><pubDate>Fri, 28 Dec 2012 04:17:46 GMT</pubDate><dc:creator>sumith1andonly1</dc:creator></item><item><title>Employee Dimension/Weekly Sales Fact</title><link>http://www.sqlservercentral.com/Forums/Topic1397919-363-1.aspx</link><description>In our star schema, we have an employee dimension.  That dimension includes the employee, the store where the employee works, and the organizational hierarchy above the employee.  We also have a Weekly Sales fact table.  There is a PK/FK relationship between the dim table and the fact table.  That relationship has referential integrity enforcement.  Here is the problem.  Our Weekly Sales fact table also includes sales from our external partners.  Because those external trading partners are NOT employees, their sales violate referential integrity.  What should I do?</description><pubDate>Tue, 18 Dec 2012 10:33:29 GMT</pubDate><dc:creator>imani_technology</dc:creator></item><item><title>Item/Foreign Currency Question</title><link>http://www.sqlservercentral.com/Forums/Topic1394727-363-1.aspx</link><description>Each item has a currency (USD, AUD, CAD, etc.) and a currency amount.  Each currency amount has a starting effective date and an ending effective date.    Right now currency, currency amount, currency from date and currency to date are all in the Item dimension.Here’s the question:  should the currency amount actually be a measure?  Should it be part of our sales measure group/fact tables?  If so, how?</description><pubDate>Mon, 10 Dec 2012 11:48:37 GMT</pubDate><dc:creator>imani_technology</dc:creator></item><item><title>[Error loading mining model metadata: No mining models were found.]</title><link>http://www.sqlservercentral.com/Forums/Topic1396742-363-1.aspx</link><description>I will be involved in existing data warehouse project in my company.I am starting to learn about data warehouse - my primary role is dba.When I start to browse the project, I have message in the middle window (SQL management Studio):“[Error loading mining model metadata: No mining models were found.]”What does that mean - I thought this is necessary part of the warehouse?In addition: best first book/article on the net?Thanks,Brano</description><pubDate>Fri, 14 Dec 2012 10:16:15 GMT</pubDate><dc:creator>branovuk</dc:creator></item><item><title>Modeling Survey Data</title><link>http://www.sqlservercentral.com/Forums/Topic1394107-363-1.aspx</link><description>I work for a distributor and biannually we send out a survey to our employees to gauge how well our vendors are doing. After the employees take the survey we compile a report and share it with the vendor. Right now it is an access database and a crystal report. I am looking to get it into SSRS but am having difficulties determining how to model the data. This is what I have so far:Dim_VendorDim_Question (survey questions have set values and comments)Dim_EmployeeDim_Date (start-end date of survey)Fact_SurveyDo I have two fact tables one for the comments and the other for the set values? The Dim_Question table seems a bit light. All I can think of putting in it is the question, category of question and comment y/n. Another part of this vendor process is that we give them a report on total sales we have for their product lines. Any help is appreciated. Thanks.</description><pubDate>Fri, 07 Dec 2012 09:03:53 GMT</pubDate><dc:creator>MikeBrey</dc:creator></item><item><title>SSIS using Stored Procedures to do incremental loads into DW</title><link>http://www.sqlservercentral.com/Forums/Topic1359550-363-1.aspx</link><description>Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?Here is the bacground. We use SSRS to report off of our production server via stored procs. We are now thinking of creating a replicated server to which we will point the report into. However, we also have a need to give the users access to report data (stored procs), so they can link them with other data sets as they need to using MS Query in excel. Problem is that users won't be able to link data sets if they are based off of stored procs. Have already tried encapsulating SPs inside of Functions but that also does not seem to work.What my boss wants: - Forget the replicated server. - Create a DW which will hold the report data sets in a table per report, to be populated by the existing report stored procedures via SSIS which will be updated every 5 minutes. Some of our SPs are highly complex and have aggregated result sets.- Have all the reports point to their corresponding tables in the DW- Have the users access these tables so they can link the dataset as they need to.JC</description><pubDate>Fri, 14 Sep 2012 12:22:45 GMT</pubDate><dc:creator>jtc900502</dc:creator></item><item><title>Building SSAS Cubes - Beginner</title><link>http://www.sqlservercentral.com/Forums/Topic1386819-363-1.aspx</link><description>Hi,Is it mandatory to have a data warehouse in dimensional model to build a SSAS cubes above?I'm trying to see if there is any scope to address few reporting needs without the need to build a traditional Data Warehouse first and then build cubes on top of the data warehouse, instead build cubes to do Financial monthly aggregated reporting needs sourcing from backup copy of the Transaction systems.Appreciate, if somebody can point me to right direction, I'm still learning and hence this silly query.PS: I have posted same query in Analysis Services Forum assuming this query should have been posted here and not there. Is this okay? Regards,KK</description><pubDate>Tue, 20 Nov 2012 04:24:11 GMT</pubDate><dc:creator>kamal_ece</dc:creator></item><item><title>Datawarehouse - My PO keeps asking for more tempdb</title><link>http://www.sqlservercentral.com/Forums/Topic1385251-363-1.aspx</link><description>I think we have all been in this situation before.  Devs fill TEMPDB then say they need more TEMPDB.Here is my setup: I have a 4 node DW each with about a 1-1.5TB database living on them.  My largest table is about 850GB in each.  (1,440,695,540 rows, 353,312,408KB data, 510,599,720KB index (lol yea)).My tempdb is setup like this:8 5GB data files (so 40GB total tempdb data file space)1 10GB log fileAs a dba here is what I have always maintained: if your query takes more than 10GB of tempdb to run, the query and/or design is flawed.  in this case I have given them 40GB data and 10GB log and yet they still manage to fill that from time to time with queries that run for hours.  yes, the fill can be a stats issue.  In either situation, imo, the solution is NOT to add more tempdb, but to fix the underlying problem w/ the query/stats.So really discuss this and help chime in with your feedback on whether you think my logic is correct in that my file sizes are adequate and you should fix the root cause, or I should give in and give more tempdb space for what I viewed as flawed logic.  </description><pubDate>Thu, 15 Nov 2012 09:43:29 GMT</pubDate><dc:creator>Brian-444611</dc:creator></item><item><title>Show Right Column to Right User</title><link>http://www.sqlservercentral.com/Forums/Topic1340374-363-1.aspx</link><description>If I have three different user with different occupation (manager, salesman, accounting)The main question is to how display right column to right person based on star schema and requirement below in SQL server?The fact and dim are using regular table inside of data mart.Background information:    The manager is authorized to see all column in factTransaction    The salesman is not allowed to see TaxAmount, TotalAmount and ProductBusinessKey.    The Accounting is note allowed to see Product Quantity, ProductPrice and GeographyFullname.In windows, the they have their own user account.The picture is take from the address (http://stackoverflow.com/questions/3308647/design-of-a-data-warehouse-with-more-than-one-fact-tables)[img]http://i.stack.imgur.com/DlnsI.png[/img]</description><pubDate>Mon, 06 Aug 2012 00:42:46 GMT</pubDate><dc:creator>akirajt</dc:creator></item><item><title>Unknown dimension values or</title><link>http://www.sqlservercentral.com/Forums/Topic1380754-363-1.aspx</link><description>In your DW do you always create an "-1 Unknown" record for each dimension?  For example in DimEmployee:[code="plain"]EmployeeKey -1EmployeeName "Unknown"ect.[/code]Or do you just rely on setting SSAS's UnknownMember and UnknownMemberName properties?[url]http://technet.microsoft.com/en-US/library/ms170707%28v=sql.105%29.aspx[/url]Just curious what other folks do.Thanks,Rob</description><pubDate>Sat, 03 Nov 2012 17:44:31 GMT</pubDate><dc:creator>robert.gerald.taylor</dc:creator></item><item><title>Fact and dimension structior for Ticket tracking</title><link>http://www.sqlservercentral.com/Forums/Topic1385394-363-1.aspx</link><description>Fact and dimensional structure for Ticket tracking  (Sorry can't edit the topic)Hi,I am new to Dimensional modeling and SSAS reporting.  I am working on creating a cube but first I have to create the fact and dim tables.I have to keep track of the status of tickets, dates they were opened and closed, operator that worked on the ticket, count of the tickets, the duration in days the ticket was open before it closed, and the # of tickets per customer.TablesFactTicket: Count for ticket, duration of ticket, status of ticket, DimDateDimCustomerDimOperator: Operator_id, type, ticketcodePlease let me know if I should add any more dimtables.  Any suggestions or advice is appreciated.  Thanks.</description><pubDate>Thu, 15 Nov 2012 16:14:09 GMT</pubDate><dc:creator>agnetha_11</dc:creator></item><item><title>Does it make sense store measures in a dimension table?</title><link>http://www.sqlservercentral.com/Forums/Topic1379230-363-1.aspx</link><description>Hi everybody:Right now I´m working in a BI solution and I received a set of tables with a multidimensional structure. I have a Fact table with transaction from customers and several dimensional tables. One of them is the KUNDE or customer table, in which the typical customer information is stored, like customer key, a key to another table (contact details), customer since date, etc. I also found numeric values in this table like "amount of purchases last 12 Month" or "Total shopping days" inside this table. That makes me wonder if it makes sense or not to store this measures inside the table, moreover, I'm building another table to store the customer data as a type II SCD and I´m not able to persuade the analyst to not include measures in this table. What is your opinion about that? Any comment would be appreciated.Kind Regards,</description><pubDate>Wed, 31 Oct 2012 04:29:14 GMT</pubDate><dc:creator>Paul Hernández</dc:creator></item><item><title>Combining businesses with people in one dimension</title><link>http://www.sqlservercentral.com/Forums/Topic1381657-363-1.aspx</link><description>In the application data that I wish to transform into a data warehouse, people and businesses are basically treated in identical ways. In the OLTP database the core data about both types of entities are stored in the same table, with the "LastName" column doing double duty as the "BusinessName" column as well. There is an indicator column to distinguish individuals from business, but there really is very little else to distinguish them.In the data warehouse, should businesses and people be commingled in the same dimension? If so, should the "LastName" column still do double duty, or should there also be a separate "BusinessName" column? Filters on the entities' names would be complicated by such separation, which is one reason that a single column is being used for both types of names in the OLTP database.If the business data and people data are in separate dimensions, wouldn't that complicate the fact table structure? In such a case there would be two separate dimensions instead of one to reference for the "who did it" data for transactions.Any help or insight is welcome.Thanks in advance,Geoff</description><pubDate>Tue, 06 Nov 2012 09:59:28 GMT</pubDate><dc:creator>geoff5</dc:creator></item></channel></rss>