Analysis :: Hierarchy Based On Dimension Table Joined Multiple Times Against A Fact Table?
Aug 11, 2015
I am working on a model where I have a sales fact table. Each fact record has four different customer fields (ship- to, sold-to, payer, and bill-to customer). I have one customer dimension table that joins to the sales fact table four times (once for each of the customer fields above). When viewing the data in Excel, I would like to have four hierarchies (ship -to, sold-to, payer, and bill-to customer) within Customer.
Is there a way to build hierarchies within my Customer dimension based on the same Customer table? What I want is to view the data in Excel and see the Customer dimension. Within Customer, I want four hierarchies.
As u can see there is two company references in my fact table, and the schema is in snowflake. My customer requirements state that the Contracts' amounts can be aggregated/filtered for/by, ServiceProviderCompany, its city/profession or ClientCompay, its city/profession.
First thing came in to my mind is to dublicate whole dimension structure (one for serviceproviders, one for clients), which i thought that there should be another way around?
I am modelling cube in SSAS. Cube has around 20 dimensions and 6 fact tables. Some of the dimensions are common among the fact tables. e.g. Time dimension. Fact_PNL has 3 date columns for those we have 3 role playing dimensions in the dimension usages.
Another fact table has 5 date columns for them as well we have separate role playing dimensions in dimension usage tab. We have a common dimension Company which is foreign key in all fact tables. We might need to combine the data from multiple facts to get final output.
Should i create 6 role playing dimension for each of the fact table or use the same dimension for all fact tables?
Role playing dimensions should be created when we have multiple columns pointing to the same dimension ?
I am modelling two fact tables of Actuals and Budget which are at different granularity, Actuals are at day, customer and product sub category level. Budgets are at month, Region and Product category level.
Month, Region and Product Category is present in Date, Region and Product Category dimension respectively. I have only three dimensions as Customer, Product and Date. Linking those dimensions to Actual Fact table is not an issue, what is the best way and options are there to link budget fact table to those three dimensions.
Say you have a fact table with a few columns that all reference the same key column in a dimension table, you want to write a view to return the information for those keys?
USE MyTestDB; GO SET NOCOUNT ON; IF OBJECT_ID ('dbo.FactTemp' ,'U') IS NOT NULL DROP TABLE dbo.FactTemp;
[Code] ....
I'm using very small data at the moment, and the query plan and statistics don't really say which way.
I'm loading a fact table that has several geographic attributes - some are at the state level, some are at the county level, and then some are drilled farther in that that. I understand the basic concept of the dimension with the ragged hierarchy, but unsure of how to load to the fact table using lookups based on these geographic units. For example, if my geographic dimension contains 200 records for the state of Wyoming, basically a record for each fine-grain place (i.e. city/town), then how do I go about doing a county lookup. Wyoming only has 23 counties, but because of the repetitive nature of the dimension attributes that are not at the finest grain, I'll get more records in the lookup than I need. This activity repeats of course while I move up the geographic scale to state, then country. How do I configure/fill my dimension to handle these differing scales of data?
we have a problem with "one-to-many relations between fact table and dimension table". Take the example of table "LOGGEDFLAW" which is related one-to-many to the table "LOGGEDREASON. "LOGGEDFLAW" includes the column "FLAWKEY" and "LOGGEDREASON" includes the column "REASONKEY" and essentiallay the column "FLAWKEY" as foreign key. Now assume that we have the following records in there:
Now assume, that "LOGGEDFLAW" is the facttable and "FLAWCOUNT" is the measure with the source column "FLAWKEY" in which we want to count the number of FLAWs. As you see in the example the number of FLAWs is 1 for "FLAW1" and "FLAW2". Microsoft Analysis Server generates the value of 2 for the number of FLAWs "FLAW1" because of the one-to-many relationship to the table "LOGGEDREASON". In the attached ZIP File you find :
- a MDB File with the described example - a screenshot from the cube constructed in AS - a screenshot from the result table generated with AS.
The question: How is it possible to calculate the measure "FLAWCOUNT" correctly, ignoring the records generated by the one-to-many relationship?
In my data modell I have defined the 2 tables "Person" and "Category":
Table "Person" ---------------- [PersonID] [int] IDENTITY(1,1) NOT NULL [CategoryID] [int] NOT NULL [FirstName] [nvarchar](50) [LastName] [nvarchar](50)
Table "Category" ---------------- [CategoryID] [int] IDENTITY(1,1) NOT NULL [CategoryName] [nvarchar](50)
Now I like to read my first row from the source and lookup a value for the CategoryID "sailing". As my data tables are empty right now, the lookup is not able to read a value for "sailing". Now I like to insert a new row in the table "Category" for the value "sailing" and receive the new "CategoryID" to insert my values in the table "Person" INCLUDING the new "CategoryID".
I think this is a normal way of reading data from a source and performing some lookups. In my "real world" scenario I have to lookup about 20 foreign keys before I'm able to insert the row read from the flat file source.
I really can't belief that this is a "special" case and I also can't belief that there is no easy and simple way to solve this with SSIS. Ok, the solution from Thomas is working but it is a very complex solution for this small problem. So, any help would be appreciated...
I have a large flat file that comes to me. I first import the flat data in to a SQL table for ease of use. Then i put it into a more permanent table with the proper references to dimension tables. I want to build a dimension table out of information from my flat file. I have a dimension table with columns, [Org Client], and [Client#] where [org client] is the name of the client. Both of these columns appear in my flat file but i want to use only the client# in my permanent table. How extract distinct values of client # and [org client] into a dimension table?
My idea was to select distinct values of client# and use some type of foreach loop to go through each client# and use a query to select the TOP(1) values of [org client] where client# = x. Would this work and if so how do I go about setting this up?
I'm really hoping there is a simpler way than this. Thank you all for your time.
Now I create datawarehouse for my client, I have SSIS a lot for ETL process, I a problem that some fact table need to be updatetable and there is a lot of data of this, I need some efficent way to load this data to data warehouse. I have read your article about SCD in SSIS (Slowly Changing Dimensions in SQL Server 2005). I think the purpose of SCD for Dimension table. If I have some fact table that need rows to be updatetable can you give me an example, best practice, the efficient way or fastet way to load fact table that can be updatetable? If you have link or link about this problem please reply my email. Thanks My datasource from ORACLE and my datawarehouse in mssql2005
What is the best way to move data from Online system tro data warehouse? I have created 3 dimension tables(product,date and customer tables) and I wanna create fact table and get foreign keys from dimension tables. What is the best method to do that in SSIS?
I've got a dimension built from a fact (whatever that's called?) ... it's a date interval field, i.e. 0-5 weeks, 6-10 weeks 11+ weeks. How do I sort these members in the respective order? Looks like this currently:
The problem lies in the fact that I don't have any secondary attributes to order it by, i.e. it's not a physical dimension where I can use a key for the 3 members. I was hoping I wouldn't need to create a separate dimension to get round this.
I created a Fact Table with 3 Keys from dimension tables, like Customer Key, property key and territory key. Since I can ONLY have one Identity key on a table, what do I need to do to avoid populating NULLs on these columns..
I want to update the startdate column (for all rows) so that when period is 0 then the new value is a hardcoded value (say '01-Dec-2000') but for all other rows it takes the value in the enddate column for the row of the previous column (with the same freq)
ie the startdate column for period 1 takes the enddate value for period 0 and so on for a particular freq
create table #periods (period int , startdate datetime , [enddate] datetime , freq int) insert #periods ( period , startdate , enddate , freq) select 0 , '01-Jan-1900' , '31-Jan-2001' , 1 union all select 1 , '01-Jan-1900' , '28-Feb-2001' , 1 union all select 2 , '01-Jan-1900' , '31-Mar-2001' , 1 union all select 3 , '01-Jan-1900' , '30-Apr-2001' , 1 union all select 4 , '01-Jan-1900' , '31-May-2001' , 1 union all select 0 , '01-Jan-1900' , '31-Jan-2002' , 3 union all select 1 , '01-Jan-1900' , '28-Feb-2002' , 3 union all select 2 , '01-Jan-1900' , '31-Mar-2002' , 3 union all select 3 , '01-Jan-1900' , '30-Apr-2002' , 3 union all select 4 , '01-Jan-1900' , '31-May-2002' , 3
/* I know I need a case statement to test for column 0 and to join the table on itself and have put something together but it fails for column 0 and updates to NULL - I think it must be to do with the join ??
This is what I've got so far :
UPDATE PA1 SET PA1.Startdate = CASE WHEN PA2.period = 0 THEN 2000-12-01 00:00:00.000 ELSE PA1.Enddate END FROM #periods AS PA1 JOIN #periods AS PA2 ON PA1.Freq = PA2.Freq AND PA1.Period = PA2.Period + 1
I have a cube with some dimensions (dim1,dim2,dim3).I want to add another dimension (dim4) based on dim1 but more restricted.Eg. Dim1 is a product dimension, and I will create Dim4 from a subset of Dim1 (only few and predetermined products).
I will create a view (or a named query) with the appropriate where clause, create a dimension on this view/namedQ, then add on the existant cube and link it on dimension usage. Great. But in this way in the cube I haven't all fact data but only data for the product of Dim4! (Dim1 is still present).
eg: in fact data I have a total of 100$ of sales for all the products, if I link the Dim4 then the sales amount is 40$, that is the value for the product in Dim4 not for all product (I can't set the filter for Dim1/Dim4 in excel). I haven't found any way for get the 100$.
I try with two Date dimension: the first from 01/01/2010 to 31/12/2015 and the second from 01/01/2015 to 31/12/2015. If I link both I see only data for the second period (only 2015).
it seems that the more restrictive dimension trumps.What's wrong? There are other ways to get what I want? I've tried the property "DependOnDimension" but without results.
I won't publish the attribute used in the WHERE clause on the dimension.
Is it correct to say that for each cube you can have only one Fact Table? I am having a funny dispute just now.
According my experience I never built cubes with more than one fact table, if I want take data from more than one table I write a view and I use it as Fact table...but a cube with two or three fact tables? Never tried..
I have a Fact table that contains several degenerate string values that I have pulled into a Fact Dimension.
When I browse the cube and cut one of the measures by an attribute from the Fact Dimension, I am getting incorrect data.
In other words, when I query the fact table directly via SQL and apply the same filters, I see the data I am expecting to see. But cube browse with same filters yields different results.
How can this happen since the fact dimension has a 1:1 relationship with the fact table.
I do have the Dimension Usage configured properly.
Is this an aggregation thing? Attribute key thing? What am I missing?
I have a big table with several types of transsactions: PO (Puchase Orders, SO,( Sales orders), INV (Invoices) .... I want to create cubes with only one type of transactions (1 cube for PO,...)
Where and how can I filter the rows I want to use in my cube ?
My question here is as the dimension has SCD type 2 on it and every time when there is a change the persn_key gets a new key value but the fact table still points to oldest key.how to update the surrogate key on fact table to the current key value? As per the requirement fact surrogate key must be pointing to current active record on the dimension.
I have a disconnected utility dimension with the following data and a waterfall->subwaterfall hierarchy. I have ignore if same as parent set on sub water fall. The unary operator is assigned to the relevant level
Opening ~ Opening ~ Price ~ Price ~ Total Composition ~ Composition L1 + Total Composition ~ Composition L2 + Total Composition ~ Composition L3 + Total Composition ~ Composition L4 + Total Composition ~ Composition L5 + Closing ~ Closing ~
I'm using MDX to set values to the members
SCOPE(Measures.Waterfall,[Dim Waterfall].[Water Fall].&[Price]); THIS=Measures.LowestPrice; END SCOPE; SCOPE(Measures.Waterfall,[Dim Waterfall].[Sub Water Fall].&[Composition L1]); THIS=Measures.[Composition L1]; END SCOPE;
[Code] ....
But when i run a simple select query
SELECT {[Measures].[Waterfall]} ON COLUMNS, [Dim Waterfall].[Water Fall].[Water Fall] ON ROWS FROM [GPA]
I only get a value for price. Total Composition is NULL. If I change my query to look at the subwaterfall level then I see values for composition L1->L5, so I know there are values there I was expecting the unary operator to aggregate my composition measures into Total Composition.
I should point out that every measure that's being assigned to my utility dim is calculated with in excess of a dozen or so steps and intermediate calcs, is this causing some kind of solve order issue?
I know that as a workaround i could probably do something like
SCOPE(Measures.Waterfall,[Dim Waterfall].[Water Fall].&[Total Composition]); THIS=SUM([Dim Waterfall].[Hierarchy].[Water Fall].&[Total Composition].Children,Measures.Waterfall); END SCOPE;
But that defeats the purpose of the unary operators
Include children and exclude children in a single hierarchy in parent child dimension in mdx
*12-parent **20-parent - 9-parent --250-child1 --210-child2 --240-child3 aggregation of 12-parent only aggregation of 20-parent only aggregation of 9 with children
Can we push the data for the above query in a physical table and create index to make the query fast rather than using the same set tables multiple times
I have a large fact table about 500 million rows, and I am using 2008 r2, thus I am having the file system error, I have browsed online and tried all the fix , but I am still having the error . I tried taking only about year data (which was still around 200 million records) and I was still having the error.
We have a sales fact table that has sales by customers, products, date etc. and dimension tables such as customers, products etc.
We built a multi-dimensional cube on top of these tables. how do we support queries like "show me top 10 products by sales amount and then show customers that didn't buy these products"?
Hello-I'm fairly new to writing SQL statements and would greatly appreciatesome help on this one.I'm working on a project for a non-profit that I volunteer for. Partof the database tracks membership using tables like this:PersonInfo-------------------PersonID (primary key)FirstNameLastNameetc..PeopleMemberships-------------------PPLMembershipIP (primary key)PersonIDMembershipTypeIDFeePaidMembershipTypes--------------------MembershipTypeID (primary key)MembershipYearStandardFeeMembershipDescription (varchar)Just because a person is in PersonInfo, doesn't mean they have anythingin PeopleMemberships (they can be in the databse for other reasons andnot have or have ever had a membership).Membership fees vary by year and type of membership and they want toretain a history of a person's memberships.What I'm looking to do here is write a query (a view in SQL Server)that will return the following InfoPersonID, MostRecentMembershipYear, FeePaidForThatMembership,DescriptionOfThatMembershipI'm thinking that I'd use max(MembershipYear), but that requires groupby for the other columns, so I'm getting all of the people'smemberships returned.I'm pretty sure this can be best done with a subquery, but I'm not surehow.Can someone please point me in the right direction or provide a samplethat I can learn from?Kindly,Ken
Is it possible to design a Cube with a Dimension "DimEmployee" (SCD Type2) and Query the following: What "title" had the person "xyz" at time "2015-01-01"? - See Example data on Images.
I have also a DimTime with Day granularity. The DimEmployee is not at day granularity.