I loaded one table via SSIS and found that it contained many duplicate records (from the input source). I can create a SQL task to delete them, but I wonder if SSIS offers and task "out of the box" to delete dups?
I need a sql statement to delete duplicate records.
I have a college table with all colleges in the nation. I noticed that all of the colleges were listed twice. How do I delete all of the duplicate records.
Here is my table. Colleges ------------------- schoolID - smallint NOT NULL, schoolName - varchar(60) NULL
Can someone help me out with the sql statement??? I'm running SQL Server 6.5.
Hi All, I am having one table named MyTable and this table contains only one column MyCol. Now i m having 10 records in it and all the records are duplicate ie value is 7 for all 10 records.
It is something like this,
MyCol 7 7 7 7 7 7 7 7 7 7
Now i m trying to delete 10th record or any record then it gives me error "Key column information is insufficient or incorrect. Too many rows were affected by update."
What should i do if i want only 4 records insted 10 records in my table? How do i delete the 6 records from table?
Rajarajan writes "Kindly don't ignore this as regular case. This is peculiar. I need to delete one of duplicate records only if they occurs consecutively. eg.
1. 232 2. 232 3. 345 4. 567 5. 232
Here only the first record has to be delete. Kindly help me out.
I managed to find the 'Deleting Duplicate Records' from SQLTeam.com (thanks, by the way!!).. I managed to modify it for one of my tables (one of 14).
-- Add a new column
Alter table dbo.tblMyDocsSize add NewPK int NULL go
-- populate the new Primary Key declare @intCounter int set @intCounter = 0 update dbo.tblMyDocsSize SET @intCounter = NewPK = @intCounter + 1
-- ID the records to delete and get one primary key value also -- We'll delete all but this primary key select strComputer, strATUUser, RecCount=count(*), PktoKeep = max(NewPK) into #dupes from dbo.tblMyDocsSize group by strComputer, strATUUser having count(*) > 1 order by count(*) desc, strComputer, strATUUser
-- delete dupes except one Primary key for each dup record deletedbo.tblMyDocsSize fromdbo.tblMyDocsSize a join #dupes d ond.strComputer = a.strComputer andd.strATUUser = a.strATUUser wherea.NewPK not in (select PKtoKeep from #dupes)
-- remove the NewPK column ALTER TABLE dbo.tblMyDocsSize DROP COLUMN NewPK go
drop table #dupes
Now that I've got that figured out, I need to write the same thing to fix the other 13 tables (with different column info)- and I'll need to run this daily.
Basically I've put together some vbscript that gathers inventory data and drops it into an MSDE db (sorry - goin for 'free' stuff right now). Problem is it has to run daily so that I'm sure to capture computers that turned on at different times etc which ever-increases my database 'till I bounce off the 2GB limit of MSDE.
So the question is, what would be the best way to do this? Can I put the code into a stored procedure that I can execute each day?
I have an SQL tables [Keys] that has various rows such as: [ID] [Name] [Path] [Customer] 1 Key1 Key1 InHouse 2 Key2 Key2 External 3 Key1 Key1 InHouse 4 Key1 Key1 InHouse 5 Key1 Key1 InHouse
Obviously IDs 1,3,4,5 are all exactly the same and I would like to be left with only: [ID] [Name] [Path] [Customer] 1 Key1 Key1 InHouse 2 Key2 Key2 External
I cannot create a new table/database or change the unique identifier (which is currently ID) either. I simply need an SQL script I can run to clean out the duplicates (I know how they got there and the issue has been fixed but the Database is still currently invalid due to all these duplicate entires).
I have an SQL tables [Keys] that has various rows such as: [ID] [Name] [Path] [Customer] 1 Key1 Key1 InHouse 2 Key2 Key2 External 3 Key1 Key1 InHouse 4 Key1 Key1 InHouse 5 Key1 Key1 InHouse
Obviously IDs 1,3,4,5 are all exactly the same and I would like to be left with only:
I cannot create a new table/database or change the unique identifier (which is currently ID) either. I simply need an SQL script I can run to clean out the duplicates (I know how they got there and the issue has been fixed but the Database is still currently invalid due to all these duplicate entires).
Hi , i am using sql server 2005. i have one table where i need to find records that have same citycode and hospitalcode and doctorcode then delete the record keeping only one record of them my problem is table structure have idendtity column which is unique. that is m table structure is something like
I have a situation where deleting old records is blocking updating latest records on highly transactional table and getting timeout errors from application.
In details, I have one table called Tran_table1 in OLTP database. This Tran_table1 is highly transactional table, it will receive data for insert/update continuously
While archiving 2 years old records from Tran_table1 into Tran_table1_archive in batches(using DELETE OUTPUT INTO clause), if there is any UPDATEs on Tran_table1,these updates are getting blocked and result is timeout errors in application.
Is there any SQL Server hints to avoid blocking ..
I was wondering if anyone had a suggestion as to how to delete duplicate rows from a table. I have been doing this:
SELECT * INTO TempUsersNoRepeats FROM TempUsers2 UNION SELECT * FROM TempUsers3
This way I end up with a total of four tables (the fourth table being the original Users table) and I was hoping that there was a way that I could do this all within the the original Users table and not have to create the three TempUsers tables.
Hi, i need the suggestion here in very familiar db situation ..i have a main table and a primary key of that table is used in many other table as foreign key.If i am deleting a record in a main table,how do i make sure that all the corresponding record in the associated tables,where that foreign key is used, gets deleted too?What are my options?Thanks
I'm trying to delete some records from some tables in a SQL Server 2008 R2 database. There's a foreign key relationship between the two tables. To make things easier here's the definition of both tables:
-- Parent table CREATE TABLE [dbo].[PharmInvInItemPackages]( [InventoryInDetailID] [int] IDENTITY(1,1) NOT NULL, [InventoryInID] [int] NOT NULL, [ItemPackageID] [int] NOT NULL,
Is there a way to delete records from table passing parameter as tablename? I won't to delete all records from a table dependent on table selected. i'm trying to do this with stored procedure...Table to delete depends on the checkbox selected.
Current code(works) Public Function DelAll() MZKDB = MZKHRFin If sloption = "L" Then sqlConn.ConnectionString = "Server=" & MZKSrv & ";Initial Catalog=" & MZKDB & ";Integrated Security=SSPI;" ElseIf sloption = "S" Then sqlConn.ConnectionString = "Server=" & MZKSrv & ";User id=sa;Password=" & MZKPswd & "; Initial Catalog=" & MZKDB & ";" End If sqlConn.Open() sqlTrans = sqlConn.BeginTransaction() sqlCmd.Connection = sqlConn sqlCmd.Transaction = sqlTrans Try sqlCmd.CommandText = sqlStr sqlCmd.ExecuteNonQuery() sqlTrans.Commit() frm2.txtResult.Text = frm2.txtResult.Text & " " & TableName & " Prior records have been deleted from the database." & vbCrLf SetCursor() Catch e As Exception Try sqlTrans.Rollback() Catch ex As SqlException If Not sqlTrans.Connection Is Nothing Then frm2.txtResult.Text = frm2.txtResult.Text & " " & TableName & " An exception of type " & ex.GetType().ToString() & " was encountered while attempting to roll back the transaction." & vbCrLf SetCursor() End If End Try frm2.txtResult.Text = frm2.txtResult.Text & " " & TableName & " Records were NOT deleted from the database." & vbCrLf SetCursor() Finally sqlConn.Close() End Try ResetID() End Function If cbGenFY.Checked Then sqlStr = "DELETE FROM FIN_FiscalYear" TableName = "dbo.FIN_FiscalYear" DelAll() ClearCounts() timeStepStart = Date.Now GenFY() timeStepStop = Date.Now
DispOneCounts() End If If cbGenFund.Checked Then sqlStr = "DELETE FROM FIN_Fund" TableName = "dbo.FIN_Fund" DelAll() ClearCounts() timeStepStart = Date.Now GenFund() timeStepStop = Date.Now DispOneCounts() End If If cbGenFunc.Checked Then sqlStr = "DELETE FROM FIN_Function" TableName = "dbo.FIN_Function" DelAll() ClearCounts() timeStepStart = Date.Now GenFunc() timeStepStop = Date.Now DispOneCounts()
End If If cbGenObject.Checked Then sqlStr = "DELETE FROM FIN_Object" TableName = "dbo.FIN_Object" DelAll() ClearCounts() timeStepStart = Date.Now GenObject() timeStepStop = Date.Now DispOneCounts() End If If cbGenCenter.Checked Then sqlStr = "DELETE FROM FIN_Center" TableName = "dbo.FIN_Center" DelAll() ClearCounts() timeStepStart = Date.Now GenCenter() timeStepStop = Date.Now DispOneCounts() End If If cbGenProject.Checked Then sqlStr = "DELETE FROM FIN_CodeBook" TableName = "dbo.FIN_CodeBook" DelAll() sqlStr = "DELETE FROM FIN_BudgetAccnt" TableName = "dbo.FIN_BudgetAccnt" DelAll() sqlStr = "DELETE FROM FIN_Budget" TableName = "dbo.FIN_Budget" DelAll() sqlStr = "DELETE FROM FIN_Project" TableName = "dbo.FIN_Project" DelAll() ClearCounts() timeStepStart = Date.Now GenProject() timeStepStop = Date.Now TableName = "dbo.FIN_Project" DispOneCounts() End If If cbGenProgram.Checked Then sqlStr = "DELETE FROM FIN_Program" TableName = "dbo.FIN_Program" DelAll() ClearCounts() timeStepStart = Date.Now GenProgram() timeStepStop = Date.Now DispOneCounts() End If If cbGenGL.Checked Then sqlStr = "DELETE FROM FIN_gl" TableName = "FIN_gl" DelAll() ClearCounts() timeStepStart = Date.Now GenGL() timeStepStop = Date.Now DispOneCounts() End If If cbGenRevenue.Checked Then sqlStr = "DELETE FROM FIN_Revenue" TableName = "FIN_Revenue" DelAll() ClearCounts() timeStepStart = Date.Now GenRevenue() timeStepStop = Date.Now DispOneCounts() End If If cbGenBank.Checked Then sqlStr = "DELETE FROM FIN_VendorBankAccnt" TableName = "dbo.FIN_VendorBankAccnt" DelAll() sqlStr = "DELETE FROM FIN_VendorBank" TableName = "dbo.FIN_VendorBank" DelAll() sqlStr = "DELETE FROM FIN_bankAdd" TableName = "dbo.FIN_bankAdd" DelAll() sqlStr = "DELETE FROM FIN_bankTERMS" TableName = "dbo.FIN_bankTerms" DelAll() sqlStr = "DELETE FROM FIN_bank" TableName = "dbo.FIN_bank" DelAll() ClearCounts() timeStepStart = Date.Now GenBank() timeStepStop = Date.Now TableName2 = "dbo.FIN_bankTERMS" TableName3 = "dbo.FIN_BankAdd" TableName4 = "dbo.FIN_VendorBank" TableName5 = "dbo.FIN_VendorBankAccnt" DispTwoCounts() End If If cbFinAP.Checked Then sqlStr = "DELETE FROM FIN_Period" TableName = "FIN_Period" DelAll() ClearCounts() timeStepStart = Date.Now GenPeriod() timeStepStop = Date.Now DispOneCounts() End If
I have a query that is executing properly but i want to delete the results of the query. I am trying to do it but i am messing up somewhere in the syntax
Can anybody help me out with this problem? Below is the query DELETE FROM DPT_NEW_BINS WHERE( Select BIN,LEFT(ACCT_NUM_MIN,16), LEFT(ACCT_NUM_MAX,16), ISO_CTRY_CD, REGN_CD ,PROD_TYPE_CD FROM DPT_NEW_BINS o WHERE EXISTS (SELECT BIN,LEFT(ACCT_NUM_MIN,16), LEFT(ACCT_NUM_MAX,16), ISO_CTRY_CD, REGN_CD ,PROD_TYPE_CD FROM DPT_temp_NEW_BINS i WHERE o.ACCT_NUM_MIN = i.ACCT_NUM_MIN AND o.ACCT_NUM_MAX = i.ACCT_NUM_MAX))
No transaction log involved, only the table itself.
Use sp_spaceused "table_name" to check the space used.
It seems the table size actually increased from the beginning to the middle of deletion, at the end of deletion, its size decreased.
Recovery mode set to be simple, autoshrink turned on.
The tables tested are about 50MB ~ several GB in size, all have the same behavior. The size increased about 5%~10%.
Since the deletion is called from another software, I want to know if it is possible for SQL Server to have this behavior or it is absolutely the 3rd party software's issue
I have many large tables with millions of records in a SQL Server database. They all use an Identity column which is the clustered index. We haven't been deleting any records until recently because disk space is now becoming a problem.
Assuming I delete a lot of old records, I'm thinking that the freed up space in the data pages won't be reused. New records will be added at the end of the table because of the Identity column being the clustered index. So the table will keep getting bigger even though there is lots of free space.
Assuming I'm right, then how do I recapture this unused space? Is an alter table rebuild the best way? Or rebuilding the clustered index?
How do i remove duplicate records from a table with a single query without using cursors or anything like that.Sample :tempCol11221P.S The table has only one column
Hello friends, I have a one problem, i have a table in that some reocrds are duplicate.i want to find which records are duplicate. for exp. my table is as follows emp_id emp_name 1 aa 2 bb 3 cc 1 aa 3 cc 3 cc and i want the result is like emp_id emp_name 1 aa 1 aa 3 cc 3 cc 3 cc
I have a table that has duplicate records with the exception of the ID and I am trying to write a while loop that would go through the table, locate all duplicate records based on a field called LASTNAME.
I just discovered that all my records appear twice inside my table, inother words, they repeat on the row below. How can I delete all of theduplicates? I'm sure there must be a tidy line of sql to do that.Thanks,Bill
I uploaded some data about 2 or 3 times and it keep appending it to thetable.Now I want to keep only first duplicate and delete rest of.Suppose part number 123 has been added 3 times so I want to keep only 1record.Thanks
Hello Frnds....Can anybody give the answer of this question as How to Delete duplicate records from Table ? I Know that with check option and also with Unique Constraint we can avoid to enter duplicate records in table but How to delete from table which does not have any constraints ?
any useful SQL Queries that might be used to identify lists of potential duplicate records in a table?
For example I have Client Database that includes a table dbo.Clients. This table contains various columns which could be used to identify possible duplicate records, such as Surname | Forenames | DateOfBirth | NINumber | PostalCode etc. . The data contained in these columns is not always exactly the same due to differences caused by user data entry; so some records may have missing data from some of the columns and there could be spelling differences too. Like the following examples:
1 | Smith | John Raymond | NULL | NI990946B | SW12 8TQ 2 | Smith | John | 06/03/1967 | NULL | SW12 8TQ 3 | Smith | Jon Raymond | 06/03/1967 | NI 99 09 46 B | SW12 8TQ
The problem is that whilst it is easy for a human being to review these 3 entries and conclude that they are most likely the same Client entered in to the database 3 times; I cannot find a reliable way of identifying them using a SQL Query.
I've considered using some sort of concatenation to a new column, minus white space and then using a "WHERE column_name LIKE pattern" query, but so far I can't get anything to work well enough. Fuzzy Logic maybe?
the results would produce a grid something like this for the example above:
ID | Surname | Forenames | DuplicateID | DupSurname | DupForenames 1 | Smith | John Raymond | 2 | Smith | John 1 | Smith | John Raymond | 3 | Smith | Jon Raymond 9 | Brown | Peter David | 343 | Brown | Pete D next batch of duplicates etc etc . . . .
I want to delete duplicate row from a very big table. Actually this table is used by a SP, and it's a very important. Due to duplicate record entry it's falling. I use bellow method for discarding the dulicate record.
PLz tel me it's the most efficent way to this job or u have some other way
1. I drop the primary key
2. Then I let all the duplicate record came into the table
3. then I removed them by using Group by clause and setting rowcount(1 - group by count).
4. Put primary key back and update the statistics.
Code is
If Exists (select * from SYSINDEXES where name='PrimaryKey' and id=Object_id('AdjustmentTransactions')) DROP INDEX AdjustmentTransactions.PrimaryKey
Insert into AdjustmentTransactions (UrnABS, UrnBar, .................Description) Select a.UrnAbs, a.UrnBar, a.TxnUrn..................... a.Description from TxnProcess a where a.InsUpFlag = 'I' and a.Processed = 'N' and ASCII(b.TxnType) = 65
Set @row_count=0 Declare dup_cursor cursor for Select UrnBAR,TxnUrnBarPat,count(*) counts from AdjustmentTransactions group by UrnBAR,TxnUrnBarPat having count(*) > 1 Open dup_cursor Fetch next from dup_cursor into @VUrnBAR,@VTxnUrnBarPat,@count
While (@@Fetch_Status = 0) Begin Select @row_count=@count-1 Set rowcount @row_count Delete from AdjustmentTransactions where UrnBAR=@VUrnBAR and TxnUrnBarPat=@VTxnUrnBarPat Fetch next from dup_cursor into @VUrnBAR,@VTxnUrnBarPat,@count End Set rowcount 0 Close dup_cursor Deallocate dup_cursor If not Exists (select * from SYSINDEXES where name='PrimaryKey' and id=Object_id('AdjustmentTransactions')) CREATE UNIQUE INDEX [PrimaryKey] ON [dbo].[AdjustmentTransactions]([UrnBAR], [TxnUrnBarPat]) ON [PRIMARY] Update Statistics AdjustmentTransactions
I have a csv file that I need to import daily into a SQL Server 2005 table. Much of the table contents could just be overwritten with the new csv file, however there are a set of Rows within the table that need to be appended to , rather than overwritten. There is no Primary Key in the csv file that can be used. I'm not sure this is the best approach, but what I have been trying to do, is append the entire csv file to the existing table, and then go back and delete the duplicates. When I run the Delete, it does delete the majority of the records, but leaves a couple hundred behind. The number left behind varies with each run, can't seem to identify a pattern here. Running the Delete a second time does clean up the rows left behind in the first execution of the Delete, and gives the result I want. Any thoughts as to why this needs to be run twice? Or is a better approach available? Here is my code - SELECT [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time], dupcount=count(*) INTO temppkgactions FROM pkgactions GROUP BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time]HAVING count(*) > 1
DELETE TOP (SELECT COUNT(*) -1 FROM dbo.temppkgactions WHERE dupcount > 1 ) FROM dbo.pkgactions DROP TABLE temppkgactions
I have problem in deleting duplicate rows. I have a identity column in my table, if I try to use correlatted sub query with Delete command it gives error.
The other problem I have is I have a date column in my table and update that column with current date and time. If use a query to fetch a records on a particular day , it does not return any rows
select * from rates where ch_date >='02/11/99' and ch_date<='02/11/99'
If I use convert also there is some other problems. Is there any way to force date checkings to be done excluding time.
This is an imaginary problem while discussing ROWID in ORACLE.
Consider a table without primary key, unique key, uniuqe index. A row has inserted into the table many times. I want to delete all but one dulicated rows. With any 'where' clause all rows(duplicated) will be deleted. In ORACLE i can achieve this using ROWID as follows:
Delete from Table_name where < all column values > and ROWID <> ( Select max(rowid) from Table_name where < all column values > )
How can this be achieved in MS SQL Server 6.5 ?
According to Dr. Codd's Golden rules for RDBMS one is that One should be able to reach each data value in the database by using table name, row idenfication value and column name.
Does MS SQL Server 6.5 satisfy this requirement ?
Also How many of Dr. Codd's 13 Golden Rules for RDBMS does MS SQL Server 6.5 Satisfy? Which doesn't ?