SQL Server 2012 :: Finding Duplicates And Show Original / Duplicate Record

Nov 3, 2014

There are many duplicate records on my data table because users constantly register under two accounts. I have a query that identify the records that have a duplicate, but it only shows one of the two records, and I need to show the two records so that I can reconcile the differences.The query is taken from a post on stack overflow. It gives me 196, but I need to see the 392 records.

How to identify the duplicates and show the tow records without having to hard code any values, so I can use the query in a report, and anytime there are new duplicates, the report shows them.

SELECT
[groom_first_name]
,[groom_last_name]
,[bride_first_name]
,[bride_last_name]

[code]....

View 5 Replies

SQL Server 2012 :: Select Statement - Finding Duplicate Records

Feb 18, 2014

I am trying to build a select statement that will

1). find records where a duplicate field exists.
2.) show the ItemNo of records where the Historical data is ' '.

I have the query which shows me the duplicates but I'm stuck when trying to only show records where the Historical field = ' '.

Originally I thought I could accomplish this goal with a sub query based off of the original duplicate result set but SQL server returns an error.

Below you will find a temp table to create and try in your environment.

create table #Items
(
ItemNovarchar (50),
SearchNo varchar (50),
Historical int

[Code] ....

Ultimately I need a result set that returns 'ATMR10 MFZ N', and 'QBGFEP2050 CH DIS'.

View 3 Replies View Related

SQL Server 2012 :: Using Select Query To Get 2014-08-09 11:13:03 From Original Date Records?

Aug 18, 2014

I have date field in table and data contain 2014-08-09 11:13:03.340

when I use smalldatefield I am getting 2014-08-09 11:13:00 I should have got 2014-08-09 11:13:03

Why the 03 is trimming in smalldatefield.

how do I use select query to get 2014-08-09 11:13:03 from original Date records

View 1 Replies View Related

Finding Duplicates In SQL

Jul 26, 2002

I have a customer database with the following structure:

Dept ID (int)
Section (varchar)

I need to find only occurrences of a section (eg Admin) where the section name has a record in Dept 1,2 and 3 - only return the result if the record for Admin is associated will these depts.

Example Data:

Dept Section
1 Admin
1 Finance
1 Marketing
2 Admin
2 Sales
2 Marketing
3 Admin
3 Sales
3 Finance

The query would only return 'Admin' since this is the only Section that is represented in Departments 1,2 + 3.

I am not in control of the data design and can't create a bitwise field to achieve this.

tia,
Steve

View 1 Replies View Related

Finding Duplicates

Jul 7, 2004

I am trying to complete an insert from query but the problem is I have duplicates, so I'm getting an error message. So to correct it I am creating a Find Duplicates statement in the Query analyzer but Its not working can someone tell me whats wrong with this statement (by the way I'm in SQL 2000 Server)

thank you

SELECT EmployeeGamingLicense [TM#]AS [TM# Field], Count([TM#])AS NumberOfDups
FROM TERMINATION
GROUP BY [TM#]
HAVING Count([TM#])>1;
GO

View 14 Replies View Related

Finding Duplicates - Is This Right?

May 16, 2007

I've done a search and I THINK I've got my head round this, but I'd be very grateful if someone could reassure me:

SELECT Email FROM List1 WHERE EXISTS (
SELECT Email FROM List2 WHERE List2.Email= List1.Email
) AND List1.Email <> '44'

That will give me every email address from list one that (a) appears in list two, and (b) isn't '44'. Right?

And to find all the emails from List1 that DON'T occur in List2 (and aren't '44'), I just put "NOT" in front of "EXISTS". Right?

Sorry for asking an obvious question but I'm having a real mental block here. :o

View 6 Replies View Related

Finding Duplicates

Jun 5, 2008

how can i find duplicates in my database? i want to find them in the website and expressemail field

View 5 Replies View Related

Finding Duplicates Using More That One Key

Jun 16, 2008

Hi guys,

I was just wondering if any one knows how to find duplicate keys using more than one field. I used the below key to find those people who exists in list1 but don't exists in list2. I realized that the results had some duplicates which was expected but how do I then find all those duplicate people. I know how to do it if there was a primary key present I would have done a count (distinct cardnumber) > 1 and i would have done the select statement like this distinct cardnumber, but how do I do it with more that one key??

View 1 Replies View Related

Finding Duplicates

Jul 20, 2005

I have a company table and I would like to write a query that will return tome any duplicate companies. However, it is a little more complicated thenjust matching on exact company names. I would like it to give me duplicateswhere x number of letters at the beginning of the company name match AND xnumber of letters of the address match AND x number of letters of the citymatch. I will be doing this in batches based on the first letter of thecompany name. So for example I will first process all companies that startwith the letter "A".So for all "A" companies I want to find companies where the first 5 lettersin the company name match and the first 5 characters of the address fieldmatch and the first 5 characters of the city match. THANKS!!!

View 3 Replies View Related

Finding Duplicates?

Oct 9, 2007

I'm trying to find duplicates. Any help will be greatly appreciated! Here's my query:

I'm trying to find only the records where the "Sales_Header_Your_Reference" field is used more then once with a different "Sales Header Document ID" So it looks like this
Sales Header Document ID Sales_Header_Your_Reference
1955718 0002377729
2082721 0002478945
2082728 0002478976
2093598 0002487318
2093599 0002487318
2093601 0002487332
I hope im clear enough, here is my query. I get then listed right, but the count is all wrong, I only want to show the ones , like the one i have highlighted.

Code Block

Select COUNT(Sales_Header_Your_Reference),Tbl_Sales_Header.Sales_Header_Document_Id,Tbl_Sales_Header.Sales_Header_Your_Reference
FROM dbo.Tbl_Sales_Header
INNER JOIN dbo.Tbl_Sales_Invoice_Header
ON dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = dbo.Tbl_Sales_Invoice_Header.Sales_Invoice_Header_Bill_Customer_No AND
dbo.Tbl_Sales_Header.Sales_Header_Order_DateTime = dbo.Tbl_Sales_Invoice_Header.Sales_Invoice_Header_Order_Datetime
WHERE (dbo.Tbl_Sales_Header.Sales_Header_Order_DateTime > CONVERT(DATETIME, '2007-09-15 00:00:00', 102)) AND
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'Butler') OR
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'MWI') OR
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'NLS') OR
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'HSI') OR
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'NLSMVD') OR
(dbo.Tbl_Sales_Header.Customer_Bill_Customer_No = 'RNEXT')
GROUP by Tbl_Sales_Header.Sales_Header_Document_Id,Tbl_Sales_Header.Sales_Header_Your_Reference
HAVING COUNT(Sales_Header_Your_Reference) > 1
order by Tbl_Sales_Header.Sales_Header_Document_Id, Tbl_Sales_Header.Sales_Header_Your_Reference

View 9 Replies View Related

Finding Duplicates Regardless Of Special Characters - Help Plz

Nov 22, 2005

I am trying to move distinct data from one table to another based on two columns regardless of special characters such as : ( ) ; ' " / - _ = >< etc. The two columns that are the defining factors that I need to match as duplicates are Col1 & Col3, the rest I want to get the maximum data from each column for that row (I hope this made sense).

Here is what I have tried, but it does not seem to work:

Code:

select
ID,
Col1,
max(Col2)as Col2,
Col3,
max(Col4)as Col4,
max(Col5)as Col5,
max(Col6)as Col6,
max(Col7)as Col7,
max(Col8)as Col8,
max(Col9)as Col9
where patindex('%[^-:]%',Col1) = 0 AND patindex('%[^-:]%',Col3) = 0
into Newtable
from dbo.CLEANING_bk
group by Col1,Col3

Basically if there are 10 rows of duplicate data based on Col1 & Col3 (regardless of special characters), I want to move the distinct info to a new table with the maximum amount of info from the other columns of the duplicate rows.

If row 3, 4,5, & 6 are all considered duplicate and Row3/Col2 has info in it that row 4 & 5/Col2 does not, I want to combine it with the single row I export to the new table retaining as much information from the rows of the duplicates as possible.

Can anyone help and let me know why my script is not wanting to play nice?

View 14 Replies View Related

Finding Duplicates - What On Earth Am I Overlooking?

Aug 15, 2006

I have two lists of contacts. They're similar. I want a list of all the contacts whose email address occurs only in the first list.

SELECT COUNT(DISTINCT EMAIL) FROM List1
returns 13460

SELECT COUNT(DISTINCT EMAIL) FROM List2
returns 13220

SELECT EMAIL FROM List1 WHERE EMAIL NOT IN (SELECT DISTINCT EMAIL FROM List2)
returns 0 rows

How can it be returning no rows? What am I failing to take into consideration?

:confused:

View 6 Replies View Related

Finding And Replacing Duplicates In Sql Using A Script

Mar 27, 2008

Finding and replacing duplicates

I have this project; I am trying to work on but can��t because of duplicated data in that table.

I tried to make the culume not accept duplicates, but I need to clean it up first and I can just delete them.

Currently the values are 11 digit numbers and I need to replace them with 00000000001

incrementing, making my last one 00000003000

Any idea on where to even begin? I am new to this.

What I am trying to do.

Find replicates numbers in TRAN_DATA

Replace each with a new number but increment the new number starting with 00000000001 and ending with 00000003000

With this I knew I had 3000 plus.
ELECT TRAN_LCTR_NR,
COUNT(TRAN_LCTR_NR) AS NumOccurrences
FROM TRAN_DATA
GROUP BY TRAN_LCTR_NR
HAVING ( COUNT(TRAN_LCTR_NR) > 1 )

View 5 Replies View Related

Query To Show Duplicates

Aug 16, 2005

mytable fld1 int primkey fld2 varchar(20), fld3 varchar(20)From the definition of the above table, how do i do i modify my below query to only select the rows with duplicates in fld2. I know a groupby with a having count will display the duplicates for a given field, but i want my query to see all the fields and rows that are duplicates. select fld1, fld2, fld3 from mytable

View 1 Replies View Related

How To Show Duplicates In Reports

May 31, 2006

I'm trying to create a report in Sql Server 2005 Reporting Services where it will display duplicates rows in the reports. I'm unable to find the property that enables this. I would need something like HideDuplicates = false, but that is not the way it is intended to be used. If HideDuplicates is left blank, it seems to still hide duplicate values if the preceding row matches.

The report is a crosstab type, so we are using a Matrix for the report item. I'll illustrate with a simple example:

This is what we currently get:

Catagory
Class
Detail

Type 1
A
Item 1

B
Item 2

Item 3

Type 2
A
Item 4

B
Item 5

This is what we want:

Catagory
Class
Detail

Type 1
A
Item 1

Type 1
B
Item 2

Type 1
B
Item 3

Type 2
A
Item 4

Type 2
B
Item 5

I'm sure its something simple, but we just cant seem to find the solution,

Thanks

Darin

View 9 Replies View Related

Transact SQL :: Comparing Records - Finding Matches / Duplicates

Nov 20, 2015

I have this 40,000,000 rows table... I am trying to clean this 'Contacts' table since I know there are a lot of duplicates.

At first, I wanted to get a count of how many there are.

I need to compare records where these fields are matched:

MATCHED: (email, firstname) but not MATCH: (lastname, phone, mobile).
MATCHED: (email, firstname, mobile)
But not MATCH: (lastname, phone)
MATCHED: (email, firstname, lastname)
But not MATCH: (phone, mobile)

View 9 Replies View Related

SQL Server 2008 :: Not Finding A Record With A Space In The Middle Of String

May 19, 2015

Ok following scenario

create table dbo.ADAssets
(Computername nvarchar(max))
insert into dbo.ADAssets
values
('AIRLBEOF3565 CNF:4e926e06-6f62-4864-aebd-6311543d',
'AIRLBEOF3565')

So if execute the following

select Computername
from dbo.ADAssets
where Computername like
'AIRLBEOF3565%'

I get both records,but if I do this

select *
from dbo.ADAssets
where Computername in
(
'AIRLBEOF3565 CNF:4e926e06-6f62-4864-aebd-6311543d',
'AIRLBEOF3565'
)

I only get AIRLBEOF3565

So the big picture is that I need to compare 2 tables to find records that match & don't but that I get matches that shouldn't be & matches that aren't.

View 4 Replies View Related

TOUGH INSERT: Copy Sale Record/Line Items For Duplicate Record

Jul 20, 2005

I have a client who needs to copy an existing sale. The problem isthe Sale is made up of three tables: Sale, SaleEquipment, SaleParts.Each sale can have multiple pieces of equipment with correspondingparts, or parts without equipment. My problem in copying is when I goto copy the parts, how do I get the NEW sale equipment ids updatedcorrectly on their corresponding parts?I can provide more information if necessary.Thank you!!Maria

View 6 Replies View Related

SQL Server 2012 :: Adding Count To Query Without Duplicating Original Select Query

Aug 5, 2014

I have the following code.

SELECT _bvSerialMasterFull.SerialNumber, _bvSerialMasterFull.SNStockLink, _bvSerialMasterFull.SNDateLMove, _bvSerialMasterFull.CurrentLoc,
_bvSerialMasterFull.CurrentAccLink, _bvSerialMasterFull.StockCode, _bvSerialMasterFull.CurrentAccount, _bvSerialMasterFull.CurrentLocationDesc,
_bvSerialNumbersFull.SNTxDate, _bvSerialNumbersFull.SNTxReference, _bvSerialNumbersFull.SNTrCodeID, _bvSerialNumbersFull.SNTransType,
_bvSerialNumbersFull.SNWarehouseID, _bvSerialNumbersFull.TransAccount, _bvSerialNumbersFull.TransTypeDesc,

[code]...

However, as you can see, the original select query is run twice and joined together.What I was hoping for is this to be done in the original query without the need to duplicate the original query.

View 2 Replies View Related

SQL Server 2012 :: Filter Duplicates Based On String?

Feb 19, 2014

CREATE TABLE #Names
( ID INT IDENTITY(1,1),
NAME VARCHAR(100)
)
INSERT INTO #Names VALUES ('S-SQLXX')
INSERT INTO #Names VALUES ('S-SQLXX.NA.SN.ORG')
INSERT INTO #Names VALUES ('S-SQLYY')
INSERT INTO #Names VALUES ('S-SQLYY.NA.SN.ORG')
INSERT INTO #Names VALUES ('S-SQLCL-HR')
INSERT INTO #Names VALUES ('S-SQLCL-MIS')
SELECT * FROM #Names

--I want to filter out S-SQLXX.NA.SN.ORG because S-SQLXX.NA.SN.ORG is a duplicate of S-SQLXX eliminating .NA.SN.ORG from it.

--I want to filter out S-SQLYY.NA.SN.ORG because S-SQLYY.NA.SN.ORG is a duplicate of S-SQLYY eliminating .NA.SN.ORG from it.

--However I want to keep S-SQLCL-HR and S-SQLCL-MIS in my list of names as they do not have .NA.SN.ORG as a part of their name

--I want ONLY these returned IN the SELECT

SELECT * FROM #Names WHERE ID IN (1,3,5,6)
DROP TABLE #Names

View 1 Replies View Related

How To Delete Duplicate Rows Retaining Only One Of Them From Every Set Of Duplicates.

Apr 10, 2008

I have a table employee_test having the sample data. The rows with EmployeeID=6 are duplicate rows. I want to delete the duplicates retaining one row for the employeeid=6.
Note :- I don't want to use a temporary table. I want to do this using a single query or at the most in a SP query batch. Please advise.

EMPLOYEEID

ENAME

SALARY

MANAGERID

1

Anee

1000

11

2

Rick

1200

12

3

JOHN

1100

13

4

ABC

1300

14

5

DEF

1400

15

6

DEF

1400

15

6

DEF

1400

15

View 22 Replies View Related

Duplicate Key Was Ignored Warning Returned Even When No Duplicates Are Found

Sep 27, 2013

I am having problems with the "Duplicate key was ignored" warning message. The problem is that the message seems to happen randomly and cannot be reproduced. If i take the same set of data and run the stored procedure that causes the problem i don't get the warning message a second or subsequent time. Also all the SELECT statements have criteria set to remove duplicates before they are inserted into the tables.

I have a data feed that pulls data from a DB2 database to a SQL Server 2008 staging table as a flattened set of records. A stored procedure in SQL Server is run to load the data into the destination tables. The data feed is run hourly for new and updated records in DB2 Monday-Friday 09:00-17:00 and then there is a midnight run of all the records going back for the last 12 months.

The data feed was originally sent from DB2 as a CSV file and pulled into SQL Server using SSIS but is now an Informatica workflow that pulls the data directly from DB2.

It is the Informatica workflow that is returning the "duplicate key was ignored" warning message and this stops the workflow. The workflow is restarted and the data is always loaded the second time without the warning message. The warning does not happen every time the workflow is run - it can run for a number of days with no warnings and then one will come through I can see in Profiler that it is SQL Server that returns the Duplicate key was ignored warning message so it is not an issue with Informatica.

I cannot reproduce the problem to get to the root cause of the issue. I would expect that if i run the same set of data through the stored procedure i would get the warning message every time, but this is not the case. Even when i step through the stored procedure i do not get the message. As the midnight data feed returns the records from the last 12 months, so by definition would include duplicates, the warning message only appears randomly and is not consistent.

View 8 Replies View Related

SQL Server 2014 :: Duplicate Record Results On 2 One To Many Tables?

Feb 1, 2015

I have 3 Tables

TableA - TAID, Name, LastName
TableB - MaleFriendsName, MaleFriendsLastName [One to many]
TableC - FemaleFriendsName, FemaleFriendsLastName [one to many]

A query returns duplicate results from TableB as well as TableC

TableB and TableC have nothing in common and should not interfere with each other.

with TwoTables as (
select
a.QuickSpec as QuickSpecId,

[Code].....
Resultset returns duplicate values on TableB AND only for 1 record in the results [As per Lynn examples]

View 1 Replies View Related

Finding Duplicate Records

Oct 29, 2007

Hello,

I searched for all the posts which covered my question - but none were close enough to answer what i'm trying to do. Basically, the scenario is thus;

Table1 contains values for UserID, Account code, and Date.

My query (below) is trying to find all the accounts assigned to a particular user ID, but also those duplicate account codes which belong to a second user ID. The date column would be appended to the result set.

The query I'm using is as follows;

select acccountcode, userid, date from dbo.table1
where exists (select accountcode from dbo.table1 where accountcode = table1.accountcode
group by accountcode
having count(*) > 1)
and userid = 'x-x-x'
order by accountcode

What I think this produces is a list of all files where a duplicate exists, but of course it leaves out the 2nd UserID...which is crucial.

Hopefully this makes sense. Any insight my fellow DBA's can share would be greatly appreciated!

Thanks,
D.

View 1 Replies View Related

SQL Server 2012 :: Finding Dependencies Of Objects

Dec 27, 2013

I need to do some clean up activities in my databases... So i intended to drop unwanted tables and views.

for that I need to find, whether the table being used by any other objects ?

How could I achieve it?

View 2 Replies View Related

SQL Server 2012 :: Finding Break In Sequence

Feb 21, 2014

I need to be able to identify breaks in a sequence so I can evaluate the data more correctly. In the sample I have given I need to be able to identify the break in sequence at 69397576, ideally I would set that as a D. My query also needs to recognize that the 3 sequences following 69397576 are sequential and would belong to that set. so the out come would look like this.

id file_name page_follow
693975631555557564_22222221114014810D
693975641555557564_22222221114014810F
693975651555557564_22222221114014810F
693975661555557564_22222221114014810F
693975671555557564_22222221114014810F
693975681555557564_22222221114014810F
693975691555557564_22222221114014810F
693975761555557564_22222221114014810D
693975771555557564_22222221114014810F
693975781555557564_22222221114014810F
693975791555557564_22222221114014810F

here is some test data.
create table test1 (id INT
, [file_name] VARCHAR(100)
, page_follow CHAR(1));
go

[code]....

View 9 Replies View Related

SQL Server 2012 :: Finding Breaks In Key Values?

Jun 12, 2014

The following is sample data I am dealing with.

SELECT * INTO TEMP
FROM
(SELECT 'AAAAA' AS CATEGORY, 'A1000' AS CODE, '01-01-2014' AS STARTDATE, '01-31-2014' AS ENDDATE
UNION
SELECT 'AAAAA' AS CATEGORY, 'A1000' AS CODE, '02-01-2014' AS STARTDATE, '02-28-2014' AS ENDDATE
UNION
SELECT 'AAAAA' AS CATEGORY, 'A1000' AS CODE, '03-01-2014' AS STARTDATE, '03-31-2014' AS ENDDATE
UNION
SELECT 'AAAAA' AS CATEGORY, 'A2000' AS CODE, '04-01-2014' AS STARTDATE, '04-30-2014' AS ENDDATE
UNION
SELECT 'AAAAA' AS CATEGORY, 'A1000' AS CODE, '05-01-2014' AS STARTDATE, '05-31-2014' AS ENDDATE) X

I need to extract the date that the value in CODE column changes to another code for each value of CATEGORY and if there is no change, to record the original CODE value and its startdate for each CATEGORY.

View 3 Replies View Related

SQL Server 2012 :: Finding First And Repeated Values

Aug 26, 2014

I'm trying to come up with a query for this data:

CREATE TABLE #Visits (OpportunityID int, ActivityID int, FirstVisit date, ScheduledEnd datetime, isFirstVisit bit, isRepeatVisit bit)

INSERT #Visits (OpportunityID, ActivityID, FirstVisit, ScheduledEnd)
SELECT 1, 1001, '2014-08-17', '2014-08-17 12:00:00.000' UNION ALL
SELECT 1, 1002, '2014-08-17', '2014-08-17 17:04:13.000' UNION ALL
SELECT 2, 1003, '2014-08-18', '2014-08-18 20:39:56.000' UNION ALL

[Code] ....

Here are the expected results:

OpportunityIDActivityIDFirstVisitScheduledEndisFirstVisitisRepeatVisit
110012014-08-172014-08-17 12:00:00.00010
110022014-08-172014-08-17 17:04:13.00001
210032014-08-182014-08-18 20:39:56.00001
210042014-08-182014-08-18 18:00:00.00010

[Code] ....

Basically I'd like to mark the first Activity for each OpportunityID as a First Visit if its ScheduledEnd falls on the same day as the FirstVisit, and otherwise mark it as a Repeat Visit.

I have this so far, but it doesn't pick up on that the ScheduledEnd needs to be on the same day as the FirstVisit date to count as a first visit:

SELECT*,
CASE MIN(ScheduledEnd) OVER (PARTITION BY FirstVisit)
WHEN ScheduledEnd THEN 1
ELSE 0
END AS isFirstVisit,
CASE MIN(ScheduledEnd) OVER (PARTITION BY FirstVisit)
WHEN ScheduledEnd THEN 0
ELSE 1
END AS isRepeatVisit
FROM#Visits

View 3 Replies View Related

Finding Primary Key Value(s) Having Duplicate Row Information

Feb 23, 2001

Hi,

I am putting my problem in an example as I Feel it would be clear.

Assume my table PEOPLE is having 4 columns with 6 rows, the SlNo being primary key.
SlNo Name LastName birthdate
1 A B x --
2 C B x |-- 1 pair (A, B, x)
3 D E y --|------------
4 A E y | |
5 A B x __| |-- 2'nd pair (D, E, y)
6 D E y ---------------
In this scenario, I need to find SlNo values having similar values in other columns. The o/p for above must be:
1
5
0
3
6
0 (0 needs to include in output for distinction in the sets)

(a)IS THIS POSSIBLE TO DO IN ONE SELECT STATEMET? and HOW?
(b)If I create another temp table tempPEOPLE and select distinct row information of the 2'nd, 3'rd and 4'th columns from the PEOPLE table and
then selecting SlNo's where the information match, I am able to get o/p
1
5
3
6
without 0...and I cannot makeout the distinct sets in this.
HOW DO I FIND THE DISTINCTION IN SETS?

Reshma.

View 1 Replies View Related

Finding A Duplicate Transaction Number

Jan 16, 1999

I have a problem with a 3rd party piece of software. Doesn't matter which, really. The problem lies
in a table called payments, with a column called txnumber...the newest version of this software fails
a check during installation with the message "duplicate txnumber in payment table." Not sure how
this could have happened, since there is no way to manually assign the txnumber, but the point is
not important. What I'd like to do is figure out a sql script that will return only the duplicate number(s)
so that I can either remove or change them manually. Unfortunately, I'm not terribly familiar with sql.

Any suggestions?

View 2 Replies View Related

Optimizing This Duplicate Finding Query

Sep 11, 2005

Hi everyone!

I am parsing a database structed like the following for duplicates.

Code:

keywordnegativeexactbroadphrase
Alpha0.120.36
Alpha0.120.37
Alpha0.120.37
Alpha0.220.37
Alpha0.120.37
Alpha0.120.37
Alpha0.120.37
Beta0.43212
Charlie0.240.31
Delta0.72212
Epsilon410.31
Foxtrot250.22
Grape420.215
Hotel 230.13
Indigo0.3721
Juliet21220.14
Kilroy0.3110.15
Lemon0.2201
Mario0.2502
Nugget0.130.72
Oprah2141
Polo01225
Polo01224
Polo01225
Q-Bert31442
Romeo12100.1
Salty2200.1
Tommy10.30.132.4
Uri10.30.132
Uri20.20.132
Uri20.30.122
Uri20.30.131
Vamprie0.120.1315
Wilco0.210.21
X-Ray00.220
Yeti020.20
Zebra1310

The duplicates that this thread relates to are the kind with duplicate "keyword" entries AND dissimilar field entries; i.e. :

Code:

keyword negative exact broad Phrase
Polo 0 122 4
Polo 0 122 5

I've come up with an SQL query that seems to return all of these duplicates (save one of each type- the 'real', unique entry). However I think I made the query very inefficient. My SQL is very bad; this query will be running over tens of thousands of rows, so if it can be at all optimized I would greatly appreciate your help!

What I have so far is:

Code:

string query1 = "SELECT * FROM TableName" +
" WHERE EXISTS (SELECT NULL FROM TableName" + " b" +
" WHERE b.[keyword]= " + "TableName"+ ".[keyword]" +
" AND b.[negative]<> " + "TableName"+ ".[negative]" +
" ORb.[keyword]= " + "TableName"+ ".[keyword]" +
" ANDb.[exact]<> " + "TableName"+ ".[exact]" +
" ORb.[keyword] = " + "TableName"+ ".[keyword]" +
" ANDb.[broad]<> " + "TableName"+ ".[broad]" +
" ORb.[keyword]= " +"TableName"+ ".[keyword]" +
" ANDb.[phrase]<> "+"TableName"+ ".[phrase]" +
" GROUP BY b.[keyword], b.[broad], b.[exact]" +
" HAVING Count(b.[keyword]) BETWEEN 2 AND 50000)" ;

the algoritm seems to check every column of every row in order to determine a duplicate. Seems straightforward to me, but alas slow...

Is there a better/faster way I can do this? Thanks for you help!

View 1 Replies View Related

Finding Duplicate Foreign Keys

May 29, 2007

Hi

i tried the following query and able to get the list of foreign keys with column names as well as referred tables and referenced column

select parent_column_id as 'child Column',object_name(constraint_object_id)as 'FK Name',object_name(parent_object_id) as 'parent table',name,object_name(referenced_object_id)as 'referenced table',referenced_column_id
from sys.foreign_key_columns inner join sys.columns on (parent_column_id = column_id and parent_object_id=object_id)
Order by object_name(parent_object_id) asc

but i am not able to get the fks created more than once on same column refering to same pk

Thanks in Advance

View 4 Replies View Related

Finding Duplicate Entries (with Different Keys)

Feb 8, 2007

Yet another simple query that is eluding me. I need to find records in a table that have the same first name and last name. Because the table has a primaty key, these people were entered twice or they share the same first and last name.

How could you query this:

ID fname lname
10001 Bill Jones
10002 Joe Smith
10003 Sue Jenkins
10004 John Sanders
10005 Joe Smith
10006 Harrold Simpson
10007 Sue Jenkins
10008 Sam Worden

and get a result set of this:

ID fname lname
10002 Joe Smith
10005 Joe Smith
10003 Sue Jenkins
10007 Sue Jenkins

View 2 Replies View Related