Finding Primary Key Value(s) Having Duplicate Row Information
Feb 23, 2001
Hi,
I am putting my problem as an example, as I feel that would be clearer.
Assume my table PEOPLE has 4 columns and 6 rows, with SlNo being the primary key.
SlNo  Name  LastName  birthdate
1     A     B         x          <-- 1st pair (A, B, x)
2     C     B         x
3     D     E         y          <-- 2nd pair (D, E, y)
4     A     E         y
5     A     B         x          <-- 1st pair (A, B, x)
6     D     E         y          <-- 2nd pair (D, E, y)
In this scenario, I need to find the SlNo values of rows that have identical values in the other columns. The output for the above must be:
1
5
0
3
6
0 (a 0 needs to be included in the output to separate the sets)
(a) IS THIS POSSIBLE TO DO IN ONE SELECT STATEMENT? And HOW?
(b) If I create another temp table tempPEOPLE, select the distinct row information of the 2nd, 3rd and 4th columns from the PEOPLE table, and
then select the SlNo's where the information matches, I am able to get the output
1
5
3
6
without the 0... and I cannot make out the distinct sets in this.
HOW DO I FIND THE DISTINCTION BETWEEN THE SETS?
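One way to approach (a) is sketched below. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so it can be run anywhere; the SELECT itself only relies on GROUP BY, HAVING and UNION ALL. The helper column is_separator and the aliases p, d and u are assumptions made for this illustration, not part of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PEOPLE (SlNo INTEGER PRIMARY KEY, Name TEXT, LastName TEXT, birthdate TEXT)"
)
conn.executemany(
    "INSERT INTO PEOPLE VALUES (?, ?, ?, ?)",
    [(1, "A", "B", "x"), (2, "C", "B", "x"), (3, "D", "E", "y"),
     (4, "A", "E", "y"), (5, "A", "B", "x"), (6, "D", "E", "y")],
)

# First branch: every row whose (Name, LastName, birthdate) occurs more than once.
# Second branch: one extra row with SlNo = 0 per duplicated combination.
# Sorting by the combination, then by is_separator, puts each 0 after its set.
query = """
SELECT u.SlNo
FROM (
    SELECT p.SlNo AS SlNo, p.Name AS Name, p.LastName AS LastName,
           p.birthdate AS birthdate, 0 AS is_separator
    FROM PEOPLE AS p
    JOIN (SELECT Name, LastName, birthdate
          FROM PEOPLE
          GROUP BY Name, LastName, birthdate
          HAVING COUNT(*) > 1) AS d
      ON d.Name = p.Name AND d.LastName = p.LastName AND d.birthdate = p.birthdate
    UNION ALL
    SELECT 0, Name, LastName, birthdate, 1
    FROM PEOPLE
    GROUP BY Name, LastName, birthdate
    HAVING COUNT(*) > 1
) AS u
ORDER BY u.Name, u.LastName, u.birthdate, u.is_separator, u.SlNo
"""
output = [row[0] for row in conn.execute(query)]
print(output)
```

With the six sample rows this yields 1, 5, 0, 3, 6, 0, which is the output requested in the question.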
Hi, Imagine that I have a select something like this: select * from T where T.Name Like '%Maria%', where "Maria" is what the user wants to find. But in Spanish, for example, we use í, ó, ú, ü... The same happens in French, German... The user wants to find Maria and does not care whether another user (or he himself) has entered the client as Maria, María, Marïa or whatever. Of course, one solution is to store only the simple characters (a, e, i, o, u) in the DB and search for these, replacing the "strange" characters with the "normal" ones in the application. But if we want to be professional, we need to store the name in the DB in its original spelling (the image is the most important!). Any idea, help or reference? Thx in advance and happy new year. David
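A possible approach is to keep the original spelling in the table and strip the accents only while comparing. The sketch below shows the idea with Python's sqlite3 module and a user-defined function; strip_accents is a hypothetical helper written for this example, and the sample names are made up. On SQL Server the usual equivalent is to compare under an accent-insensitive collation (for example Latin1_General_CI_AI applied with COLLATE) rather than a custom function.

```python
import sqlite3
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed (hypothetical helper)."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


conn = sqlite3.connect(":memory:")
conn.create_function("strip_accents", 1, strip_accents)
conn.execute("CREATE TABLE T (Name TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?)",
    [("Maria Lopez",), ("María Gómez",), ("Marïa Ruiz",), ("Mario Diaz",)],
)

# The stored names keep their original spelling; accents are ignored only in the comparison.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM T "
        "WHERE strip_accents(Name) LIKE '%' || strip_accents(?) || '%'",
        ("Maria",),
    )
]
print(matches)
```

Searching for "Maria" here returns the three spellings of Maria and skips "Mario Diaz".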
I searched through all the posts that covered my question, but none were close enough to answer what I'm trying to do. Basically, the scenario is thus;
Table1 contains values for UserID, Account code, and Date.
My query (below) is trying to find all the accounts assigned to a particular user ID, but also those duplicate account codes which belong to a second user ID. The date column would be appended to the result set.
The query I'm using is as follows;
select accountcode, userid, date
from dbo.table1
where exists (select accountcode from dbo.table1 where accountcode = table1.accountcode group by accountcode having count(*) > 1)
and userid = 'x-x-x'
order by accountcode
What I think this produces is a list of all files where a duplicate exists, but of course it leaves out the 2nd UserID...which is crucial.
Hopefully this makes sense. Any insight my fellow DBAs can share would be greatly appreciated!
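One reading of the requirement is: for every account code held by the given user that also appears under a different user, list the rows of all users involved. Under that assumption, a minimal sketch looks like the following. It runs on Python's sqlite3 module as a stand-in for SQL Server, and the sample rows and the alias t are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (userid TEXT, accountcode TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("x-x-x", "A100", "2007-01-01"),
     ("y-y-y", "A100", "2007-01-05"),
     ("x-x-x", "A200", "2007-02-01"),
     ("y-y-y", "A300", "2007-03-01"),
     ("z-z-z", "A300", "2007-03-02")],
)

# Keep account codes that belong to the given user AND are shared by more than
# one distinct user; the outer query is not filtered by userid, so the second
# user's row is returned as well.
query = """
SELECT t.accountcode, t.userid, t.date
FROM table1 AS t
WHERE t.accountcode IN (SELECT accountcode FROM table1 WHERE userid = ?)
  AND t.accountcode IN (SELECT accountcode
                        FROM table1
                        GROUP BY accountcode
                        HAVING COUNT(DISTINCT userid) > 1)
ORDER BY t.accountcode, t.userid
"""
shared = conn.execute(query, ("x-x-x",)).fetchall()
print(shared)
```

The key difference from the posted query is that the userid filter sits inside a subquery instead of on the outer rows, which is why the second user ID is no longer dropped.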
I have a problem with a 3rd party piece of software. Doesn't matter which, really. The problem lies in a table called payments, with a column called txnumber...the newest version of this software fails a check during installation with the message "duplicate txnumber in payment table." Not sure how this could have happened, since there is no way to manually assign the txnumber, but the point is not important. What I'd like to do is figure out a sql script that will return only the duplicate number(s) so that I can either remove or change them manually. Unfortunately, I'm not terribly familiar with sql.
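A query of the kind asked for here is normally a GROUP BY with a HAVING filter. The sketch below demonstrates it with Python's sqlite3 module in place of SQL Server; the table and column names come from the question, while the id and amount columns and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, txnumber INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [(1, 1001, 10.0), (2, 1002, 25.0), (3, 1002, 25.0),
     (4, 1003, 5.0), (5, 1004, 7.5), (6, 1004, 8.0), (7, 1004, 9.0)],
)

# Group by txnumber and keep only the groups that contain more than one row.
duplicates = conn.execute(
    """
    SELECT txnumber, COUNT(*) AS occurrences
    FROM payments
    GROUP BY txnumber
    HAVING COUNT(*) > 1
    ORDER BY txnumber
    """
).fetchall()
print(duplicates)
```

Each result row is a duplicated txnumber together with how many times it occurs, which is enough to look the rows up and fix them by hand.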
The duplicates that this thread relates to are the kind with duplicate "keyword" entries AND dissimilar field entries; i.e. :
Code:
keyword negative exact broad Phrase
Polo 0 122 4
Polo 0 122 5
I've come up with an SQL query that seems to return all of these duplicates (save one of each type- the 'real', unique entry). However I think I made the query very inefficient. My SQL is very bad; this query will be running over tens of thousands of rows, so if it can be at all optimized I would greatly appreciate your help!
What I have so far is:
Code:
string query1 = "SELECT * FROM TableName"
    + " WHERE EXISTS (SELECT NULL FROM TableName b"
    + " WHERE b.[keyword] = TableName.[keyword] AND b.[negative] <> TableName.[negative]"
    + " OR b.[keyword] = TableName.[keyword] AND b.[exact] <> TableName.[exact]"
    + " OR b.[keyword] = TableName.[keyword] AND b.[broad] <> TableName.[broad]"
    + " OR b.[keyword] = TableName.[keyword] AND b.[phrase] <> TableName.[phrase]"
    + " GROUP BY b.[keyword], b.[broad], b.[exact]"
    + " HAVING Count(b.[keyword]) BETWEEN 2 AND 50000)";
The algorithm seems to check every column of every row in order to determine a duplicate. Seems straightforward to me, but alas slow...
Is there a better/faster way I can do this? Thanks for your help!
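One alternative worth trying is to find the affected keywords once, with a single GROUP BY, and then join back to the table, instead of running a correlated EXISTS for every row. The sketch below assumes none of the compared columns contain NULLs and uses Python's sqlite3 module as a stand-in for SQL Server; the sample rows are invented, and whether this is actually faster on the real data would have to be measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName (keyword TEXT, negative INTEGER, exact INTEGER, "
    "broad INTEGER, phrase INTEGER)"
)
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?, ?, ?)",
    [("Polo", 0, 122, 4, 1), ("Polo", 0, 122, 5, 1),
     ("Shirt", 1, 10, 2, 3), ("Shirt", 1, 10, 2, 3),
     ("Hat", 0, 1, 1, 1)],
)

# A keyword qualifies when at least one of its columns takes more than one
# value, i.e. MIN and MAX differ within the keyword's group.
query = """
SELECT t.keyword, t.negative, t.exact, t.broad, t.phrase
FROM TableName AS t
JOIN (SELECT keyword
      FROM TableName
      GROUP BY keyword
      HAVING MIN(negative) <> MAX(negative)
          OR MIN(exact) <> MAX(exact)
          OR MIN(broad) <> MAX(broad)
          OR MIN(phrase) <> MAX(phrase)) AS d
  ON d.keyword = t.keyword
ORDER BY t.keyword, t.negative, t.exact, t.broad, t.phrase
"""
conflicting = conn.execute(query).fetchall()
print(conflicting)
```

In the sample data only the two Polo rows come back; the two identical Shirt rows are not reported because none of their columns differ.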
I tried the following query and was able to get the list of foreign keys with column names, as well as the referenced tables and referenced columns:
select parent_column_id as 'child Column',
       object_name(constraint_object_id) as 'FK Name',
       object_name(parent_object_id) as 'parent table',
       name,
       object_name(referenced_object_id) as 'referenced table',
       referenced_column_id
from sys.foreign_key_columns
inner join sys.columns on (parent_column_id = column_id and parent_object_id = object_id)
order by object_name(parent_object_id) asc
But I am not able to get the FKs that were created more than once on the same column, referring to the same PK.
Yet another simple query that is eluding me. I need to find records in a table that have the same first name and last name. Because the table has a primary key, either these people were entered twice or they share the same first and last name.
How could you query this:
ID     fname    lname
10001  Bill     Jones
10002  Joe      Smith
10003  Sue      Jenkins
10004  John     Sanders
10005  Joe      Smith
10006  Harrold  Simpson
10007  Sue      Jenkins
10008  Sam      Worden
and get a result set of this:
ID     fname  lname
10002  Joe    Smith
10005  Joe    Smith
10003  Sue    Jenkins
10007  Sue    Jenkins
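A common pattern for this is to group on the name columns to find the repeated combinations and then join back to fetch every matching row. The sketch below runs it on Python's sqlite3 module as a stand-in for SQL Server; the table name people and the aliases are assumptions, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER PRIMARY KEY, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(10001, "Bill", "Jones"), (10002, "Joe", "Smith"), (10003, "Sue", "Jenkins"),
     (10004, "John", "Sanders"), (10005, "Joe", "Smith"), (10006, "Harrold", "Simpson"),
     (10007, "Sue", "Jenkins"), (10008, "Sam", "Worden")],
)

# The derived table d holds each (fname, lname) pair that occurs more than once;
# joining back returns every row that carries one of those pairs.
same_name = conn.execute(
    """
    SELECT p.ID, p.fname, p.lname
    FROM people AS p
    JOIN (SELECT fname, lname
          FROM people
          GROUP BY fname, lname
          HAVING COUNT(*) > 1) AS d
      ON d.fname = p.fname AND d.lname = p.lname
    ORDER BY p.fname, p.lname, p.ID
    """
).fetchall()
print(same_name)
```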
I'm trying to find the primary key on a given table in SQL Server 2000 using SQL. I'm querying the sysobjects table to find a given table, and then querying the sysindexes table. I've ALMOST found what I'm looking for. I see the indexes and columns etc. on the tables in the database, I just don't see the field that indicates that the index is the primary key. Can anyone help? Thanks, Alex
I have a SQL query I need to design to select name and email addresses for policies that are due and not renewed in a given time period. The problem is, the database keeps the information for every renewal in the history of the policyholder.
The information is in 2 tables, policy and customer, which share the custid data. The polno changes with every renewal. Renewals in 2004 would be D, 2005 S, and 2006 L. polexpdates for a given customer could be 2007-03-21, 2006-03-21, 2005-03-21, and 2004-09-21, with polno of 1234 (original policy), 1234D (renewal in 2004), 1234S (renewal in 2005), and 1235L (renewed in 2006).
The policy is identified in trantype as either 'rwl' for renewal, or 'nbs' for new business.
The policies would have poleffdates of 2004-03-21 (original 6 month policy), 2004-09-21 (first 6 month renewal), 2005-03-21 (2nd renewal, 1 year), 2006-03-21 (3rd renewal, 1 yr).
I want ONLY THE LATEST information, and keep getting early information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01' and '2006-03-31'
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null
union
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01' and '2006-03-31'
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null
How do I make this query give me ONLY the polno 123%, or 123%S information, and not give me the information on policies that ALSO have 123%L policies, and/or renewal dates after 2006-03-31? Adding a 'and not polexpdate > 2006-03-31' does not work.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found it was too restrictive, and I had a valid 2003 licence, so I upgraded, and still could not do it (after updating the syntax - things like using single quotes instead of double, etc).
I keep getting those policies that were due in the stated range and HAVE been renewed as well as those which have not. I need to get only those which have NOT been renewed, and I cannot modify the database in any way.
Hi All, It seems I have been requested to carry out a complex query and the best way I think I can do this is with the use of a stored procedure. The problem is that I am not quite sure whether my SP is stated correctly and also how I would go about stating the SP in my VB.net code!
I would be ever so grateful if somebody could look over my SP code and possibly recommend a way of stating my code. My ability is limited so I would appreciate it if examples could be used with possible relations to my problem.
The SP should return Department as the end result of the query when the page is loaded. So when a row is selected in tblRisk, depending on what the Dept is in that table, it then populates the department with which it is associated from tblDept. I have left the SP below.
Many Thanks, Kunal
CREATE PROCEDURE dbo.ShowMe
@yourInputValue INT
AS
SELECT tblDept.Department
FROM tblDept
JOIN tblRisk ON tblDept.Ref = tblRisk.Dept
WHERE tblDept.Ref = @yourInputValue
RETURN 0
GO
What is the best way to compare two entries in a single table where the two fields are "almost" the same? For example, I would like to write a query that would compare the first two words in a "company" field. If they are the same, I would like to output them. For example, "20th Century" and "20th Century Fox" in the company field would be the same. How do I do this? Do I need to use a cursor? Is it as simple as using "Like?"
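A cursor is not strictly needed for this kind of comparison; a self-join on a "first two words" key can do it in one set-based query. The sketch below illustrates that idea with Python's sqlite3 module and a user-defined function. The function first_two_words, the table name companies and its id column are all assumptions for the example; on SQL Server the key would have to be built with string functions instead.

```python
import sqlite3


def first_two_words(name):
    """Lower-cased first two whitespace-separated words (hypothetical helper)."""
    if name is None:
        return None
    return " ".join(name.lower().split()[:2])


conn = sqlite3.connect(":memory:")
conn.create_function("first_two_words", 1, first_two_words)
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, company TEXT)")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?)",
    [(1, "20th Century"), (2, "20th Century Fox"), (3, "Acme Corp"),
     (4, "Acme Corp Ltd"), (5, "Globex"), (6, "20th Street Diner")],
)

# Self-join on the derived key; a.id < b.id reports each pair only once.
pairs = conn.execute(
    """
    SELECT a.company, b.company
    FROM companies AS a
    JOIN companies AS b
      ON first_two_words(a.company) = first_two_words(b.company)
     AND a.id < b.id
    ORDER BY a.id, b.id
    """
).fetchall()
print(pairs)
```

A plain LIKE 'prefix%' comparison would also match here, but it compares characters rather than whole words, so it would pair "20th Cent" with "20th Century" as well.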
I have run into a problem: I need to find out which column(s) in a table make up the primary key. I thought that this code did the trick.
***
DECLARE @c varchar(4000), @t varchar(128)
SET @c = ''
SET @t = 'contact_pmc_contact_relations'
SELECT @c = @c + c.name + ','
FROM syscolumns c
INNER JOIN sysobjects o ON o.id = c.id
INNER JOIN sysindexkeys k ON o.id = k.id
WHERE o.name = @t AND k.colid = c.colid
ORDER BY c.colid
SELECT Substring(@c, 1, Datalength(@c) - 1)
***
This works in most of my cases, but I have encountered tables where this code doesn't work. Here is a dump from one of the tables where it doesn't work.
SELECT * FROM sysindexkeys WHERE (id = 933578364) <-- id of the table
***
id         indid  colid  keyno
933578364  1      1      1
933578364  1      2      2
933578364  2      1      1
933578364  3      2      1
933578364  4      3      1
933578364  5      4      1
933578364  6      5      1
933578364  7      6      1
933578364  8      7      1
Not sure if that dump made any sense, but I hope it did. If I look at the table in SQL Enterprise Manager there are no relations and no indexes, only my primary key made up of 2 columns (column id 1 and 2).
There are many duplicate records in my data table because users constantly register under two accounts. I have a query that identifies the records that have a duplicate, but it only shows one of the two records, and I need to show both records so that I can reconcile the differences. The query is taken from a post on Stack Overflow. It gives me 196, but I need to see the 392 records.
How can I identify the duplicates and show both records without having to hard-code any values, so that I can use the query in a report, and any time there are new duplicates the report shows them?
Hi All, I'm using BCP to import ASCII text data into a table that already has many records. BCP failed because of 'Duplicate primary key'. Now, is there any way using BCP to know precisely which record's primary key caused that 'violation of inserting duplicate key'? I already used the option -O to output errors to an 'error.log', but it doesn't help much, because that error log contains the same error message mentioned above without telling me exactly which record it is, so that I can pull that 'duplicate record' out of my import data file. TIA and you have a great day. David Nguyen.
I have a Transform Data Task that copies a lot of data from my source system. Unfortunately, I cannot use a DISTINCT in the SQL from the source system, due to a very poor ODBC driver! So, when I am creating my primary key, I am trying to do a lookup on the PK column before I insert the record to see if it exists. If it does, then I skip the row. The lookup references the target database of the task.
The problem I have is that the lookup doesn't find any duplicates loaded from the database. It allows them through and causes the database to throw a primary key error.
Has anyone experienced this, or think they know what I'm doing wrong?
I get the following error when I try to insert a stored procedure in an SQL-server.
Violation of PRIMARY KEY constraint 'PK_Login'. Cannot insert duplicate key in object 'Login'.
My question is whether this is the real problem or a symptom of something else. I find it hard to believe that I am trying to insert duplicate key values. The table Login doesn't contain any values.
How can I avoid a duplicate primary key error when I use DetailsView inserting and the field column is part of the primary key? Thanks in advance! stephen
When I save this table after setting pubid and pubcode as the primary key, the following error is displayed...
Unable to create index 'PK_PUBS3'. CREATE UNIQUE INDEX terminated because a duplicate key was found for index ID 1. Most significant primary key is '51'. Could not create constraint. See previous errors. The statement has been terminated.
What I understand is that duplicates are not allowed on the primary key. How could I allow them?
I have one table that stores log messages generated by a web service. I have a second table where I want to store just the distinct messages from the first table. This second table has two columns one for the message and the second for the checksum of the message. The checksum column is the primary key for the table.
My query for populating the second table looks like:
INSERT INTO TransactionMessages ( message, messageHash )
SELECT DISTINCT message, CHECKSUM( message )
FROM Log
WHERE logDate BETWEEN '2008-03-26 00:00:00' AND '2008-03-26 23:59:59'
AND NOT EXISTS ( SELECT * FROM TransactionMessages WHERE messageHash = CHECKSUM( Log.message ) )
I run this query once per day to insert the new messages from the day before. It fails when a day has two messages that have the same checksum. In this case I would like to ignore the second message and let the query proceed. I tried creating an instead of insert trigger that only inserted unique primary keys. The trigger looks like:
IF( NOT EXISTS( SELECT TM.messageHash FROM TransactionMessages TM, inserted I WHERE TM.messageHash = I.messageHash ) )
BEGIN
INSERT INTO TransactionMessages ( messageHash, message )
SELECT messageHash, message FROM inserted
END
That didn't work. I think the issue is that all the rows get committed to the table at the end of the whole query. That means the trigger cannot match the duplicate primary key because the initial row has not been inserted yet.
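One way to sidestep the trigger is to collapse colliding messages inside the INSERT ... SELECT itself, by grouping on the hash and keeping a single message per group. The sketch below shows the idea on Python's sqlite3 module. SQLite has no CHECKSUM, so weak_checksum is a deliberately weak, hypothetical stand-in that makes a collision easy to produce; the sample rows and the half-open date range are also assumptions.

```python
import sqlite3


def weak_checksum(message):
    """Deliberately weak hash so that two different messages collide (hypothetical)."""
    return sum(ord(ch) for ch in message) % 1000


conn = sqlite3.connect(":memory:")
conn.create_function("weak_checksum", 1, weak_checksum)
conn.execute("CREATE TABLE Log (logDate TEXT, message TEXT)")
conn.execute(
    "CREATE TABLE TransactionMessages (message TEXT, messageHash INTEGER PRIMARY KEY)"
)
conn.executemany(
    "INSERT INTO Log VALUES (?, ?)",
    [("2008-03-25 09:00:00", "old message"),
     ("2008-03-26 10:00:00", "timeout ab"),
     ("2008-03-26 11:00:00", "timeout ba"),   # same checksum as "timeout ab"
     ("2008-03-26 12:00:00", "ok")],
)
# "ok" was already stored on an earlier day.
conn.execute("INSERT INTO TransactionMessages VALUES (?, ?)", ("ok", weak_checksum("ok")))

# GROUP BY the hash so each hash is inserted at most once; MIN(message) picks
# one representative message when several messages share a hash.
conn.execute(
    """
    INSERT INTO TransactionMessages (message, messageHash)
    SELECT MIN(l.message), weak_checksum(l.message)
    FROM Log AS l
    WHERE l.logDate >= '2008-03-26' AND l.logDate < '2008-03-27'
      AND NOT EXISTS (SELECT 1 FROM TransactionMessages AS tm
                      WHERE tm.messageHash = weak_checksum(l.message))
    GROUP BY weak_checksum(l.message)
    """
)
stored = [row[0] for row in conn.execute(
    "SELECT message FROM TransactionMessages ORDER BY message")]
print(stored)
```

The same GROUP BY shape should carry over to the original statement with CHECKSUM( message ) in place of the stand-in function, replacing the SELECT DISTINCT that lets two different messages with one checksum through.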
I have 3 sources for an IS flow: one is a flat file, one is a DB table, and one is the bad-data output. I could end up with duplicate primary keys, since records come from 3 sources (flat file, DB table, reject (output) table). Can anyone suggest how to handle the duplicate primary key problem in this situation?
We have a SQL Server 6.5 table, with composite Primary Key, having the Duplicate Entry for the Key. I wonder how it got entered there? Now when we are trying to import this table to SQL2K, it's failing with Duplicate row error. Any Help?
I'm trying to import a text file, but the primary key column contains duplicates (this turns out to be the nature of the legacy data). How can I kick out all duplicates except, say, for a single row per primary key value?
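A frequently used approach is to load the file into a staging table that has no primary key, then copy over only the first row seen for each key. The sketch below demonstrates it with Python's sqlite3 module as a stand-in for SQL Server; the table names staging and target, the line_no column and the sample lines are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Staging has no primary key, so duplicate ids load without errors.
conn.execute("CREATE TABLE staging (line_no INTEGER, id INTEGER, payload TEXT)")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT)")

lines = ["1,first", "2,second", "1,first again", "3,third", "2,second again"]
staging_rows = []
for line_no, line in enumerate(lines, start=1):
    key, payload = line.split(",", 1)
    staging_rows.append((line_no, int(key), payload))
conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", staging_rows)

# Copy only the row with the lowest line number for each id.
conn.execute(
    """
    INSERT INTO target (id, payload)
    SELECT s.id, s.payload
    FROM staging AS s
    WHERE s.line_no = (SELECT MIN(s2.line_no) FROM staging AS s2 WHERE s2.id = s.id)
    """
)
kept = conn.execute("SELECT id, payload FROM target ORDER BY id").fetchall()

# The rows that were kicked out can be listed for review.
rejected = conn.execute(
    """
    SELECT s.line_no, s.id, s.payload
    FROM staging AS s
    WHERE s.line_no > (SELECT MIN(s2.line_no) FROM staging AS s2 WHERE s2.id = s.id)
    ORDER BY s.line_no
    """
).fetchall()
print(kept)
print(rejected)
```

Keeping the first occurrence is an arbitrary choice here; swapping MIN for MAX keeps the last one instead.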
The point here is that I have a small table with two fields, ID (guid) as primary key and RAF (char), and the table is empty. When I add a new row I receive this exception: Violation of PRIMARY KEY constraint 'PK_tblType'. Cannot insert duplicate key in object 'dbo.tblType'. I have found no way to solve the problem. Thanks in advance.
I have a table variable into which I am inserting data from a SQL Server database. I have made one of the columns, called repaidID, a primary key so that a clustered index will be created on the table variable. When I run the stored procedure used to insert the data, I get this error message: Violation of Primary key Constraint. Cannot insert duplicate primary key in object. The value that is causing this error is (128503).
I have queried the repaidID 128503 in the database to see if it is a duplicate, but could not find any duplicate. The repaidID is a unique ID normally used by my company and does not have duplicates.
I want to import a data file into a sql table. The table has a primary key but the data could have a duplicate value in the PK column (error in the source data). How can I "trap" for this type of error in SSIS?
I am using the Import/Export wizard to import data from an ODBC data source. This can only be done from a query to specify the data to transfer.
When I try to create the tables, for the query, I am getting the following error:
Msg 2714, Level 16, State 4, Line 12
There is already an object named 'UserID' in the database.
Msg 1750, Level 16, State 0, Line 12
Could not create constraint. See previous errors.
I have duplicated this error with the following script:
USE [testing]
IF OBJECT_ID ('[testing].[dbo].[users1]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users1]
CREATE TABLE [testing].[dbo].[users1] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
CONSTRAINT [UserID] PRIMARY KEY (UserID)
)
IF OBJECT_ID ('[testing].[dbo].[users2]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users2]
CREATE TABLE [testing].[dbo].[users2] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
CONSTRAINT [UserID] PRIMARY KEY (UserID)
)
IF OBJECT_ID ('[testing].[dbo].[users3]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users3]
CREATE TABLE [testing].[dbo].[users3] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
CONSTRAINT [UserID] PRIMARY KEY (UserID)
)
I have searched the "2714 duplicate error msg," but have found references to duplicate table names, rather than multiple field names or column name duplicate errors, within a database.
I think that the schema is only allowing a single UserID primary key.
I have some troubles with IBM WebSphere Application Server using MS SQL Server 2005 JDBC Driver. I always get the error e.g. java.lang.SecurityException: class "com.microsoft.sqlserver.jdbc.SQLServerDatabaseMetaData"'s signer information does not match signer information of other classes in the same package
I found this Feedback but it seems to be closed.
A temporary solution for me was to delete the meta-inf directory of the JAR-File, but that can't be the solution.