Data Mining In 2005 - Same Input Columns Resulting In Different Decision Trees

Dec 12, 2007

While recently working with several mining models, I came across something that struck me as pretty odd - and I'm hoping to find an explanation for the behavior.

Consider the following setup:


A single table in the relational database represents the only case table
A single, continuous column is the predictable
A mining structure has been created

The mining structure contains a single model, based on the MS Decision Trees algorithm
Input columns were selected for the model via the BI Studio wizard (i.e., those provided via the "Suggest" button)
The structure has been fully processed
Now, the interesting parts:


I view the scatterplot for the mining model, under the Mining Accuracy Chart tab
Back on the Mining Structure tab, I delete one of the input columns
I add the same column back into the structure
The structure is fully processed again
When I view the scatterplot for the mining model, under the Mining Accuracy Chart tab, a different set of data points are presented for the model predictions
A different set of decision trees under the Mining Model Viewer tab confirms thisHow could different patterns have been found this second time around, even though all of the input columns were the same (as well as the training cases)?

(Note: I encountered this situation while creating a new mining model that was identical to an existing one. Even though the models received the exact same inputs and training cases, they yielded different results. I was able to reproduce the behavior by using steps 1-6 above, though.)

Can someone provide some insight on this behavior, or some kind of explanation of what may be happening?


Thanks,
Joe Miller

View 3 Replies


ADVERTISEMENT

Data Mining :: Informational (Data Mining) - Decision Trees Found No Splits For Model

Sep 29, 2015

I followed the tutorial posted at [URL] ...

Everything was ok until the last step where I had to process the mining structure which resulted in a warning

"Informational (Data mining): Decision Trees found no splits for model, Tbl Decision Tree Example."

What does this error mean? How do I resolve it? Also, I only see the first level in the Mining Model Viewer, I don't see the levels 2 and 3.

View 2 Replies View Related

Decision Trees

Nov 6, 2006

I am studying the behavior of 200.000 clients. With the use of decision trees I would like to know if my clients will abandon our service or not. I use a training set of 21.822 clients and I use a predict variable "aband" wich is a discrete variable and it can be 0 or 1. In my training set i have 21.597 cases in which aband is 0 and 255 cases in which aband is 1. Looking at the classification matrix obtained using as input table a testing set (unselected data) I can see that my decision tree doesn't recognize the cases in which aband is 1. Here is the Classification Matrix:
Counts for Dati Training on [Aband]
Predicted 0 (Actual) 1 (Actual)
0 21597 225
1 0 0

What should I do?

Chiara

View 3 Replies View Related

Decision Trees, DMX And CONTAINS (T-SQL)

May 18, 2006

I would appreciate answers to the following doubts I have regarding Decision trees, CONTAINS and using CONTAINS in a DMX query:

1. Does MS decision tree work only off equality/inequality conditions for the nodes? Is it possible to use a predicate as the branch criteria for a node?

2. Can the T-SQL predicate CONTAINS(...) be used in a DMX query? I need to check if a column-value is a substring of another column and create an intermediate column that will enable me to construct a decision tree with the phrase-present/absent branch.

3. Can CONTAINS(...) be used in a select clause? Like -

SELECT CONTAINS(JAT.column1, '"Good day"')

FROM JustAnotherTable;

4. Does CONTAINS(...) support both arguments to be column references? Or, is it mandatory that the pattern (argument #2) has to be a literal string or a variable? E.g.: I need to know the validity of the following expression -

SELECT * FROM JustAnotherTable JAT

WHERE CONTAINS(JAT.column1, JAT.column3);

View 1 Replies View Related

MS Decision Trees Question

Aug 3, 2007

Hi,

I'm new to data mining, and have created an MS decision trees model. The model has the columns age, call outcome, call reason, country name, employee name and gender - all as inputs.

In the mining model viewer, I only get nodes for the age, despite having data for all the other columns.

Can anyone help?

Thanks
Jeremy

View 12 Replies View Related

Decision Trees Parameters

Dec 8, 2007



Hi,

I'm interested in understanding how the parametes work in the MS Decision Trees algorithm.

As far as I can tell, the MINIMUM_SUPPORT and COMPLEXITY_PENALTY parameters both control the number of splits and hence the depth of the tree.

Unfortunately the BOL descriptions are very brief - so can anyone tell me the difference between these 2 parameters?

Thanks
Jeremy

View 1 Replies View Related

Calculating R-Squared From Decision Trees

Mar 19, 2007

Hello.

I am trying to build a decision tree to predict prices. I have created the tree and looked at the lift charts, but I have not seen any of the traditional statistics I am used to from other programs (R-Squared, F statistics, etc.).

Does anyone have an example of how they calculated R-Squared for a decision tree on a continuous variable?

Thanks,
Brian

View 9 Replies View Related

Predict Probability In Decision Trees

Dec 13, 2006

Hello,

I installed the bike buyer example and i am learning the DMX language. Now i wrote the following query (using MS decision trees):

SELECT
T.[Last Name],
[Bike Buyer],
PredictProbability(Predict([Bike Buyer])) AS [Probability]
From
[v Target Mail]
PREDICTION JOIN
OPENQUERY
(....... And so on..)

Now the result is surprising to me. In the resulttabel all the probabilities are equal.

Bike Buyer Probability
1 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
1 0.99994590500919611

and so on.

Now i am wondering what predictProbability means. I thought that PredictProbability meant the probability that the prediction is correct. Now all the probabilities are the same and the input is different. Can somebody tell me what PredictProbability means or am I using it wrong?

Thanx in advance,

Joris Valkonet

View 6 Replies View Related

Forcing A Branch With Decision Trees

Sep 12, 2007



In a decision tree algorithm, is there a known way to force a branch at a top level? For exmaple, I have 30 known decision patterns that are going to be completely different and I don't want them to intermingle. I wanted to force a branch at the top node on one of the 30 patterns so I wouldn't have to create 30 mining models per client.

Brian

View 4 Replies View Related

Question About Decision Trees And Neural Networks

May 9, 2006

I have some accounting data, with some transaction attributes and amounts.
I'm using Decision Trees to try and predict the next month's amount for certain combinations of attributes.

I've tried two different structures for the model:

A: one with 9 discrete text input attributes.
B: And another with the same 9 attributes + a avarage Amount for all combinations of the nine attribute for every transaction.


When i've processed them and look in the dependency network, it says that the strongest link for the structure A is attribute "1".
And for the second its the avarage-Amount attribute.
Okey, that seems fine, but the second strongest link in structure B is attribute "2".

Shouldn't it be attribute 1 like in structure A?



Second question, if I run the same data in a Neural Network model, the prediction becomes much worst then the decision tree.
I get many predictions that are negative values even though all training data contains positiv values.
The StDev becomes the same for every row also..
What am I doing wrong with that one. I have alot of transactions and a read somewhere that a Neural Network should work better than a decision tree in a case similar to mine.
The score in the "Lift chart" for the Neural Network model becomes 0,00 and for Decision Trees with the same data I get around 110.

View 1 Replies View Related

Decision Trees, How Is Prediction Probability Calculated?

Jul 26, 2007

How is the value of Prediction Probability calculated in the context of decision trees?

View 7 Replies View Related

'Decision Trees Found No Appropriate Regressors For Model' Question

Dec 7, 2006

Hi,

I am using MS Decision Trees algorithm and for a specific model i get the above warning.As a result of that i dont get any splits in my tree. Is there anything i can do to avoid this?

Thank you for reading

View 1 Replies View Related

Error: Decision Trees Found No Splits For Model

Jan 5, 2007

Hi,

I am trying to run one of the mining models from the book "Delivering BI using SQl Server 2005" but I am running into "Decision Trees found no splits for model". The mining structure has 4 columns, the fourth one being marked as "Predict Only". My Cube slice for the model has sufficient data in the cube. I am lost.. Help!!

Regards

View 4 Replies View Related

Problems With Input Columns For Mining Models

Nov 20, 2006

Hi, all experts here,

Thank you very much for your kind attention.

I got a strange problem with SQL Server 2005 data mining models though. I have selected the input columns for my mining model (which are different from the input columns for its mining structure, since I ignored some of the columns for the selected model). But the mining model still used all input columns from the mining structure rather than those I chose for the mining model.

Would please any one here give me any guidance and advices for that. Really need help for that.

Thanks a lot in advance for any help.

With best regards,

Yours sincerely,

View 7 Replies View Related

Data Mining Model Viewer Of Decision Tree Out Of Memory Error

May 18, 2007

We've successfully processed a large decision tree model in SQL Server 2005. When I try to view the tree in the mining model viewer, I get the following error:



TITLE: Microsoft Visual Studio
------------------------------

The tree graph cannot be created because of the following error:

'Exception of type 'System.OutOfMemoryException' was thrown.'.

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.42&EvtSrc=Microsoft.AnalysisServices.Viewers.SR&EvtID=ErrorCreateGraphFailed&LinkId=20476


The link provides no other documentaiton on the error.



We're using 64-bit SQL on a Dell Workstation running XP-64 with 16GB of memory. From my view of things we aren't close to running out of memory. Since the model processed and the error occurs when viewing the model, is this a problem with Visual Studio and nont necessarily Anlaysis Services?



Thanks in advance.



Nick

View 4 Replies View Related

Data Mining :: MS Excel Data Mining Add-ins / Maximum Number Of Predict Columns

Jun 4, 2015

Having successfully created :

- a data mining structure with about 80 columns.
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns. 

I thought I would then explore the possibility of have more than 2 prediction columns, in this case 20.

I get an error message and I can't work out :
a) if this is because there's a limit to the maximum number of prediction columns and where that maximum is stated.
b) if something else has become corrupted
c) there's a know bug and if the error message is either meaningful or not.

Either way, I'm unable to complete the data mining wizard 

The error message is :Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.

View 3 Replies View Related

Where Can I Store The Mining Results From Mining Models In SQL Server 2005 Data Mining Engine?

Apr 26, 2006

Hi, all here,

I am wondering where can I store my mining results in data mining engine? For example, I got mining results like accuracy chart, decision trees, and other formats of results based on different mining algorithms I used for my data mining, so where can I actually store the results for reporting service use later? Is it possible to do that in SQL Server 2005?

Thanks a lot for any help and guidance in advance.

View 4 Replies View Related

How Can We Compare Other Data Mining Software Packages With SQL Server 2005 Data Mining?

Feb 23, 2007

Hi, all experts here,

I would like to know if there is any way to migrate third-party data mining packages with SQL Server 2005 data mining algorithms together then we can have a comparison among all of them to get the best results for training models.

I am looking forward to hearing from you.

Thanks a lot.

With best regards,

Yours sincerely,

View 1 Replies View Related

Data Mining Problem: Is That Possible To Predict Many Many Columns?

Mar 22, 2007

Hello,

Can someone please assist?
I have no problem using the provided Algorithms (NaiveBayes, Decision Tree, etc) from SQL Server 2005 Data Mining. For example: If I want to predict whether the customers want to buy bike from the following data, then I use Age, Salary, Gender as input/attribute/feature selection and BuyBike column as "Predict" column.

Table
Age Salary Gender BuyBike
------------------------------------

However, say that I have 10,000 types of bikes to predict. How to do that?
Age Salary Gender BuyBike1 BuyBike2 BuyBike3 ...... BuyBike10000
------------------------------------------------------------------------------------

Are there any online resources discussing this issue? I am desperately try to solve this problem. Please assist!



Mary

View 5 Replies View Related

Data Mining :: Merge Two Or More Columns In Separate Tables

Jul 14, 2015

SQL 2008R2

Request is to merge or join or case stmt or union or... from up to four unique columns all in separate tables to new combined table (matrix) of results from said.

What is example of best method to do this ?

View 2 Replies View Related

Data Mining :: Updating Two Columns With Sequence Numbers With Where Clause

Jun 29, 2015

I have a question  in SQL server. For example I have a table which has two column like following table and I don't know how can I update theses two column with identity numbers but just the fields which are equal 111.

Example table:
numez numhx
111 111
111 111
0 111
111 0
111 0

and my results should be:
numez numhx
1 2
3 4
0 5
6 0
7 0

View 3 Replies View Related

Map Resultset From Executing A Stored Proc Into Input Columns Of A Data Flow Task

Jul 30, 2007



I need to loop the recordset returned from a ExecuteSQL task and transform each row using a Data Conversion task (or a Script Task).

I know how to loop the recordset returned by an ExecuteSQL task:

http://www.sqlis.com/59.aspx

I loop the returned recordset (which is mapped to a User variable of type System.Object) and assign the Variable Mappings in the ForEach Loop to different user variables which map to the Exec proc resultset (with names and data types).

I assume to now use these as the Available Input columns for the Data Conversion task, I drag a Data Flow task inside the For Each Loop container and double-click it, then add a Data Conversion task.

But the Input columns (which I entered in the Variable Mappings in the ForEach Loop containers) dont show up in the Available Input columns of the Data Conversion task.

How do I link the Variable Mappings in the ForEach Loop containers from the recordset returned by the Execute SQL Task to the Available Input columns of the Data Conversion task?

.......................

If this is not possible, and the advice is to use the OLEDB data flow as the input for the Data Conversion task (which is something I tried too), then the results from an OLEDB Command (using EXEC sp_myproc) are not mapped to the Available Input columns of the Data Conversion task either (as its not an explicit SQL Statement and the runtime results from a stored proc exection)

I would like to use the ExecuteSQL task to do this as the Package is clean and comprehensible. Which is the easiest best way to map the returned results from a Stored proc execution to the Available Input columns of any Data Flow transformation task for the transform operations I need to execute on each row of data?

[ Could not find any useful advice on this anywhere ]

thanks in advance!

View 4 Replies View Related

Data Mining - Scalar Mining Structure Column Data Type Error...

May 31, 2006

Hoping someone will have a solution for this error

Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.

Is the problem due to the data type of the column used in the mining structure is Long, and the underlying field in the cube has a type of BigInt,or am I barking up the wrong tree?

View 16 Replies View Related

Data Mining :: Content In Data Mining Structure Not Valid - Deployment Fails

Apr 30, 2015

I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project: 

"Error 6
Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid.
0 0
"
I am finding it hard to locate the content that is not valid. I've been trying to find a answer for this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?

View 2 Replies View Related

Error (Data Mining): The 'HISTORIC_MODEL_GAP' Data Mining Parameter Is Not Valid

Oct 25, 2007

Hi all,

I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error:
Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.

In Algorithm Parameters window, this parameters is not there by default, so I have to add it.

Any tip will be greatly appreciated.

View 3 Replies View Related

Data Mining :: Implementing Excel Data Mining In A Classroom Setting?

Jun 15, 2015

Implementing data mining Add-in in an academic setting?  We need to handle over 150 new students a semester and have their connection to Analysis Services survive for their four years at the college.  We are introducing data mining to every freshman business student as a unit within their Intro to Excel class (close to a month of work to give them a sense of what is possible).  Other courses later in their curriculum will expand on that introduction. 

Once implemented, we would have as many as 900 connections to manage (four years from now).  It is possible that multiple sections will be running at the same time, so 40 students may be accessing the data mining tools concurrently.   

Is there a way to "bulk establish" the access credentials and establish those databases?

View 4 Replies View Related

Importing Of Data, Resulting In Duplicates, HELP!

Oct 18, 2004

hi all,

I have a problem with SQL 2000 here...
I juz imported the same set of data to my database table and it gave me duplicated records as all the data is imported although there is same existing data.

Can anyone help me with this?
How can i import the data such that the data which already exists in the database will not be imported in again?

Thanks in advance.

View 4 Replies View Related

Data Mining On SQL 2005 Database

Aug 23, 2006

Is it possible to enable data mining tools on SQL SERVER 2005 database.

i need some tutorials on this? Please help ou

Ronald

View 8 Replies View Related

Data Mining :: How To Reduce Data Mining Processing Time

Aug 4, 2015

With SASS Database i have created Data mining Structure Using Time series algorithm, while processing the SSAS db, Data mining  taking long time to process, so how we can  reduce processing time ???

View 2 Replies View Related

The SQL Server 2005 Data Mining Add-ins For Office

Feb 29, 2008

I am trying to use a new microsoft add-in for office 2007. I installed the 180 day trial version of SQL Server 2005 and according to the instructions it was suppose to be very easy to connect the add in to SQL. I am receiving an error message which I cannot find a resolution to using the readme file and wonder if you can help. Here is the url to the readme file and a screenshot with a summary of the add in.

http://download.microsoft.com/download/5/0/e/50ec0a69-d69e-4962-b2c9-80bbad125641/ReadmeSQL2005.htm

http://download.microsoft.com/download/5/0/e/50ec0a69-d69e-4962-b2c9-80bbad125641/ReadmeSQL2005.htm#_3461_accessing_setup_documentation_cuy1



http://download.microsoft.com/download/5/0/e/50ec0a69-d69e-4962-b2c9-80bbad125641/RequirementsSQL2005.htm

http://www.microsoft.com/sql/technologies/dm/addins.mspx

ERROR MESSAGE
Unable to connect to server 'localhost'. Please make sure user 'ARTIMUSArt McCarty' has at least read permission to some database on the server.

DETAILS ON TH ADDIN
The SQL Server 2005 Data Mining Add-ins for Office 2007 allow you to uncover hidden patterns and relationships in your data and then put them to work to enhance the quality of your analysis.

The package you downloaded allows you to install the following add-ins:

Table Analysis Tools for Excel
With a couple of mouse clicks you can detect and analyze the key influential factors for values in your data, highlight values that don't fit with the rest of the data. More

Data Mining Client for Excel
Go through the full data mining model development lifecycle within Excel by using your spreadsheet data, or by using external data accessible through your Analysis Services database. More

Data Mining Templates for Visio
Render and share your mining models as Visio drawings that you can annotate. More
Thanks for your help.
Art

View 1 Replies View Related

Data Mining Formation For SQL Server 2005

Sep 14, 2006

Hi,

I already have data mining experience with different software, but my company is now migrating to SQL Server 2005 and since it include a data mining module, I would like to learn how to use it properly (Analysis Services, DMX language DMX, €¦).

Is there companies that offers formation in data mining for SQL Server 2005 ?

Thanks!

Tony

View 3 Replies View Related

Book Data Mining With SQL Server 2005

Oct 24, 2007

Hi all,

Is this book still the only book written for SQL2005's data mining? Does anyone know where I can find its errata? I have never seen so many editorial errors (typos, mislabeling, etc.) in other books. I am not worried about those obvious errors, but I am afraid that some errors may be so deceiving that when I find out, a lot of time will have been spent on misguided effort.

View 4 Replies View Related

How Is The 'Score' Value Derived In The Lift Chart/Mining Legend For Data Mining Models?

Sep 26, 2006

Hi,
I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output)
The Lift Chart/Mining Legend in Analysis Services shows three results €“ Score, Population Correct (%), and Predict Probability (%)

Population Correct I beleive is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?

However, I can€™t work out how the other two are derived in particular the 'SCORE'. To give a live example the scores were as follows:

Model Score Pop Correct Pred Probability
Decision Trees 0.83 76.59% 54.28%
Neural Network 0.75 67.63% 50.05%
Ideal Model 100.00%


Can anyone help with this and give a detailed explanation?

Many thanks,
S Rajput

View 4 Replies View Related







Copyrights 2005-15 www.BigResource.com, All rights reserved