Tracking Forums, Newsgroups, Mailing Lists




Count(*) To Avoid Duplicates


This is for two separate sites that share the same login table (the same username and password will access both sites).

Users may register at either site. They should then log in at the other site to "complete" the last phase of registration there, which inserts the remaining required values.

The problem is duplicates. Some users don't log in at the other site to complete the process as they should; instead they bypass it and register anew, like anyone arriving at the site for the first time. (This happens even though there is a pre-registration page at both sites that asks whether the user has already registered at the other site and directs them to log in to complete the process.)

This is a concern because one site has a 6-page registration process that inserts into 8 tables.

My idea (aside from already checking for unique inserts on usernames/passwords/emails) is to check the complete phone number.
It would be used for a conditional show region on the second page of registration: if the latest registrant enters a phone number that already exists in the table (entered previously while registering on the other site and then forgotten), the second registration page will not show, based on the count(*).

Starting with the 'Area' code, is this practical, and how do I combine all three columns?
Does anyone have better suggestions or experience with handling duplicates across two sites?

Would there be any benefit in having ALL USERS (both new ones and prior ones from the "other" site) fill out the first registration form (on the 6-page registration site), where they are then directed to log in to complete the process?

'AREA' 'PRE' 'PHONE'

SELECT *
FROM `Members`
WHERE `Area`
IN (
SELECT `Area`
FROM `Members`
GROUP BY `Area`
HAVING count(*)>1
)

Note: some registrants have the same company names for satellite/regional offices.
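
The three-column phone check described above could be sketched as follows. This is only an illustration, written in Python against an in-memory SQLite database so it runs as-is; the two SQL statements are plain enough that they should carry over to MySQL. The table and column names (Members, Area, Pre, Phone) come from the post; the sample rows and the phone_exists helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (id INTEGER PRIMARY KEY, Area TEXT, Pre TEXT, Phone TEXT)"
)
# Hypothetical sample rows: same prefix/number can appear under different area codes.
conn.executemany(
    "INSERT INTO Members (Area, Pre, Phone) VALUES (?, ?, ?)",
    [("212", "555", "0101"), ("212", "555", "0199"), ("415", "555", "0101")],
)


def phone_exists(area, pre, phone):
    """Return True if the complete three-part number is already registered."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM Members WHERE Area = ? AND Pre = ? AND Phone = ?",
        (area, pre, phone),
    ).fetchone()
    return count > 0


# Audit query: complete numbers that already occur more than once.
duplicates = conn.execute(
    "SELECT Area, Pre, Phone, COUNT(*) FROM Members "
    "GROUP BY Area, Pre, Phone HAVING COUNT(*) > 1"
).fetchall()
```

Matching on all three columns together, rather than on Area alone, avoids flagging every member who merely shares an area code.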





See Related Forum Messages: Follow the Links Below to View Complete Thread
Getting COUNT Without Duplicates
I was just reading about SQL_CALC_FOUND_ROWS, which I cannot rely on since I am stuck on MySQL 3.23.

I have a query that grabs products from a category and all of its subcategories. The category tree is maintained as a preorder tree with each node having a left and right value (makes it easy to grab all products in all subcategories).

Here is the relevant setup:

create table categories_list(
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50) NOT NULL,
lval INT NOT NULL,
rval INT NOT NULL
);

create table product_categories(
prod_id INT NOT NULL,
cat_id INT NOT NULL,
PRIMARY KEY(prod_id, cat_id)
);

create table products(
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
[product fields]
);
Products can belong to more than one category. Here is the query I use to get all products contained within the current category tree.



SELECT DISTINCT $_PRODUCT_FIELDS
FROM
product_categories as pc
LEFT JOIN
products as p ON p.id=pc.prod_id
INNER JOIN
categories_list as cl ON cl.id=pc.cat_id
WHERE
cl.lval >= $parent_left AND
cl.rval <= $parent_right
LIMIT $start_row, $rows_per_page
If I didn't have that LIMIT clause in there, I would be happy to just use the PHP function for counting num rows returned. I want to show something like "Displaying rows 11-20 of 512". So I need to know how many rows the query would return without the LIMIT clause.

I suppose I could run the query twice, once without the LIMIT, and then count all the rows, but with 20,000 products in the database I don't really want to shuffle all that data between PHP and MySQL.

I would do something like:


SELECT COUNT(*)
FROM
product_categories as pc
LEFT JOIN
products as p ON p.id=pc.prod_id
INNER JOIN
categories_list as cl ON cl.id=pc.cat_id
WHERE
cl.lval >= $parent_left AND
cl.rval <= $parent_right
That will give me a different number though, because duplicate rows will be counted -- if a product is in more than one category, it will be counted twice.

I may have made some typos in this post (please point out if something looks broken) but the query is working fine. I just want a way to count the rows without doing the query all over again.
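
One way to get the total without fetching every row is to count distinct product ids instead of rows. The sketch below uses Python with in-memory SQLite so it can be run directly; the table and column names follow the post, while the four categories and the sample assignments are made up. COUNT(DISTINCT ...) is standard SQL, but whether it behaves the same on MySQL 3.23 is an assumption worth checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories_list (id INTEGER PRIMARY KEY, name TEXT, lval INT, rval INT);
CREATE TABLE product_categories (prod_id INT, cat_id INT, PRIMARY KEY (prod_id, cat_id));
-- Hypothetical tree: 'parent' spans 1..6 and contains two children.
INSERT INTO categories_list VALUES (1, 'parent', 1, 6), (2, 'child a', 2, 3),
                                   (3, 'child b', 4, 5), (4, 'elsewhere', 7, 8);
-- Product 1 sits in two subcategories of 'parent'; product 3 is outside the tree.
INSERT INTO product_categories VALUES (1, 2), (1, 3), (2, 2), (3, 4);
""")

COUNT_SQL = """
SELECT COUNT(*), COUNT(DISTINCT pc.prod_id)
FROM product_categories AS pc
INNER JOIN categories_list AS cl ON cl.id = pc.cat_id
WHERE cl.lval >= ? AND cl.rval <= ?
"""
# rows_counted includes product 1 twice; products_counted does not.
rows_counted, products_counted = conn.execute(COUNT_SQL, (1, 6)).fetchone()
```

Only a single number travels back to the client, so the 20,000 product rows never leave the database.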

How To Find Duplicates In Two Columns And Count For Each Row Found
I have a little advanced task at hand. I want to search in a table, and find duplicates and count each duplicate found.

Imagine this table

Table | Column Name

Userinfo
Firstname | Lastname

In this table there are many rows with the same firstname and lastname. I would like an SQL statement which finds these duplicates and, for each duplicate, also reports how many occurrences it found. So the output I would like is something like this:

John Smith 5
Eva Something 8
and so on.
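
A common pattern for this is GROUP BY on both columns with a HAVING filter. The sketch below runs it in Python against in-memory SQLite; the Userinfo table and its two columns are from the post, and the sample names and counts are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Userinfo (Firstname TEXT, Lastname TEXT)")
conn.executemany(
    "INSERT INTO Userinfo VALUES (?, ?)",
    [("John", "Smith")] * 3 + [("Eva", "Something")] * 2 + [("Ann", "Lee")],
)

# One output row per duplicated name pair, with the number of occurrences.
duplicate_names = conn.execute(
    "SELECT Firstname, Lastname, COUNT(*) AS occurrences "
    "FROM Userinfo "
    "GROUP BY Firstname, Lastname "
    "HAVING COUNT(*) > 1 "
    "ORDER BY occurrences DESC"
).fetchall()
```

Names that occur only once (Ann Lee here) are dropped by the HAVING clause.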

Avoid Adding More Than Once
I'm trying to do what I think is a fairly easy query but I'm having some problems.

Basically I want to include a certain row of data in the results only if it has previously met a certain criterion. However, if it has met the criterion more than once I still want it returned only once.

Avoid Self-joins
I have a table that has values of variables for certain entities. The
columns of interest are targetID, variableID, and valueID. A row (1, 5,
9) means that target number 1 has a value of 9 for variable 5. Being
denormalized, target number one will have many possible rows in this
table, one for each variable for which it has a value.

My problem occurs when I want to find out what targets match a certain
set of variable values. For instance, I want to find out what targets
have a value of 9 for variable 5 and a value of 25 for variable 10. I'm
thinking that this can be a simple self-join:

SELECT mya.targetID from mytable as mya
LEFT JOIN mytable as myb
ON mya.targetID=myb.targetID
WHERE (mya.variableID=5 AND mya.valueID=9)
AND (myb.variableID=10 AND myb.valueID=25)

Does this make sense so far? The problem is that this doesn't scale.
When I have more than 31 variables and I need to evaluate them all,
MySQL breaks: I can't do more than 31 joins.

My design calls for perhaps 80-100 variables, so even 64-bit
architecture with a limit of 64 joins won't get me there. This is NOT
an architecture or platform issue - I need a design and a data
structure that will scale to lots of variables.

I need another data structure that won't get me stuck on too many
joins.
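
One join-free alternative is to filter for any of the wanted (variable, value) pairs and then keep only the targets that matched all of them. The sketch below shows this in Python with in-memory SQLite. The table and column names are from the post; the sample rows and the matching_targets helper are hypothetical, and the approach assumes at most one row per (targetID, variableID), which the composite primary key enforces here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (targetID INT, variableID INT, valueID INT, "
    "PRIMARY KEY (targetID, variableID))"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 5, 9), (1, 10, 25), (2, 5, 9), (2, 10, 7), (3, 10, 25)],
)


def matching_targets(wanted):
    """wanted maps variableID -> required valueID; returns targets matching all pairs."""
    clause = " OR ".join("(variableID = ? AND valueID = ?)" for _ in wanted)
    params = [item for pair in wanted.items() for item in pair]
    sql = (
        f"SELECT targetID FROM mytable WHERE {clause} "
        "GROUP BY targetID HAVING COUNT(*) = ? ORDER BY targetID"
    )
    # A target qualifies only if it matched every requested pair.
    return [row[0] for row in conn.execute(sql, params + [len(wanted)])]
```

Adding an 81st variable only adds one more OR term, so the number of joins never grows.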

How To Avoid Race Condition?
How do I lock a table for one of my insert (followed by a read) queries on a table such that other simultaneous insert/read queries on that table are put off until the first one is complete? I am trying to avoid a potential race condition.

Avoid Zero In Integer Datatype
Is there any way to avoid the zeros that are automatically entered in the database in columns that have an integer data type?

Or, I want to enter - instead of those zeros.

How To Avoid Previous Results
I want to omit all results (ids) of the first query from the 2nd query, without using a subquery...

How To Avoid Duplicate Records
I need to filter out duplicates within every 30-second window. Say I have two duplicate records within the 30-second limit: I need to show only one. If there are identical records whose times are further apart (say more than 30 seconds), then I need to display them. I only need to suppress duplicate records within 30 seconds.

How To Avoid NOT EXISTS In MySQL 4.0.26
I tried to use NOT EXISTS but recently found out that it is only supported since 4.1. Now I'm trying to avoid it but I can't figure out how to do this. The problem is that I have a table with (among others) the columns SessionId and action.

Action may be opened and closed. Now I want to get all those sessions which are in state open, i.e. which have no line with action = closed. My first attempt was:

select sessionId
from audit AS a
where action = 'opened' AND
NOT EXISTS (SELECT * FROM audit AS b WHERE b.sessionId = a.sessionId AND b.action = 'closed')

Could anybody give me a hint on how to get an equivalent query without a subquery?
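
The usual subquery-free rewrite of NOT EXISTS is a LEFT JOIN that keeps only the rows where no partner was found. Here is a sketch in Python with in-memory SQLite; the audit table and its columns are from the post, the sample sessions are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit (sessionId INT, action TEXT)")
conn.executemany(
    "INSERT INTO audit VALUES (?, ?)",
    [(1, "opened"), (1, "closed"), (2, "opened"), (3, "opened"), (3, "closed")],
)

# b is the matching 'closed' row; if there is none, b.sessionId comes back NULL.
open_sessions = [
    row[0]
    for row in conn.execute(
        "SELECT a.sessionId "
        "FROM audit AS a "
        "LEFT JOIN audit AS b "
        "  ON b.sessionId = a.sessionId AND b.action = 'closed' "
        "WHERE a.action = 'opened' AND b.sessionId IS NULL "
        "ORDER BY a.sessionId"
    )
]
```

The condition on b.action has to stay in the ON clause; moving it to WHERE would turn the join back into an inner join and return nothing.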

How To Avoid Repeat Typing
I would like to know how to avoid repeating
typing an SQL statement when an error occurs after execution. That is,
if an error occurs I should be able to retrieve the statement that I
had written and correct the mistake. It is agonizing to keep on
repeating a statement that can take five minutes to write just because
one misspelled a word or missed a comma. I use Windows 98.

Avoid Ordering When Using GROUP BY
I have a table Orders:

Id | Customer
---+---------
1 | Smith
2 | Smith
3 | Johnson
4 | Smith
5 | Smith

When using query:

SELECT GROUP_CONCAT(O.Id ORDER BY O.Id SEPARATOR ',') Id, O.Customer Customer
FROM (SELECT * FROM Orders ORDER BY Id) O
GROUP BY O.Customer
ORDER BY NULL;

I expect to get:

Id | Customer
----+---------
1,2 | Smith
3 | Johnson
4,5 | Smith

but instead of this I get:

Id | Customer
--------+---------
1,2,4,5 | Smith
3 | Johnson

How can I get the expected result?

Avoid Repeat Typing
I am a newbie in MySQL. I would like to know how to avoid repeating
typing an SQL statement when an error occurs after execution. That is,
if an error occurs I should be able to retrieve the statement that I
had written and correct the mistake. It is agonizing to keep on
repeating a statement that can take five minutes to write just because
one misspelled a word or missed a comma.

Command Used In Avoid Retyping
I don't want to retype a command once I have written it. Does anyone know about an option that would do just that, like in a DOS shell where you can use the arrow keys to get back any command you have entered?

Avoid Couples In The Resultset
I use a query like this:
select t1.id_topics, i1.id_indices , t2.id_topics from indices i1, indices i2, topics t1, topics t2
where (i1.ind ='test' and i1.id_urls=t1.id_urls and i2.id_urls=t2.id_urls and i1.ind=i2.ind and t1.id_topics<>t2.id_topics)

it gives results like those:
"id_topic1","id_indices","id_topic2"
36,682,34
37,682,36
36,682,37
37,682,34

I would like to eliminate the "inverted" mates of couples in the resultset. That means in the example
"36,682,37" should be eliminated since "37,682,36" is already part of the resultset. Is it possible to express this request in the query?

Avoid Duplicate Records In Within 30 Seconds
I'm working with PHP. I have an auction site, more or less. I want to create all-time rankings. The idea is to display where a seller ranks (all time) in the number of sales.

So for example, I'd display
John Doe All Time Sales Ranking: #138

I'm not exactly sure how to go about this.

$query = "SELECT count(*) as counter, SellerName FROM sales GROUP BY Sellername ORDER BY counter DESC";

This query would give me the data to list all of the Sellers in desc order by the number of sales.

In php, I could probably count until the Sellername was equal to $Sellername (already defined in their profile page), but I was hoping there would be a way to do this entirely in MySQL.
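
A seller's rank can be computed in SQL as one plus the number of sellers who have strictly more sales. The following sketch demonstrates the idea in Python with in-memory SQLite. The sales table and SellerName column are from the post; the sample data and the sales_rank helper are hypothetical. Note that the subquery in the FROM clause would need MySQL 4.1 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, SellerName TEXT)")
conn.executemany(
    "INSERT INTO sales (SellerName) VALUES (?)",
    [("Ann",)] * 3 + [("Bob",)] * 2 + [("John Doe",)],
)

RANK_SQL = """
SELECT COUNT(*) + 1
FROM (SELECT SellerName, COUNT(*) AS counter
      FROM sales GROUP BY SellerName) AS totals
WHERE totals.counter > (SELECT COUNT(*) FROM sales WHERE SellerName = ?)
"""


def sales_rank(seller_name):
    """Rank = 1 + number of sellers with strictly more sales (ties share a rank)."""
    return conn.execute(RANK_SQL, (seller_name,)).fetchone()[0]
```

With this definition two sellers with the same number of sales get the same rank, which may or may not be what the profile page should show.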


Trying To Avoid Using Query Inside While Loop
I've done a lot of reading on here and I learned from some of Rudy's posts that it's a bad idea to do a query inside a while loop. I don't know why this is but he seems like an expert so I'll listen

What I am trying to do is take the top ten points from a player and display them. First I will post the tables then the code.

CREATE TABLE `tournament` (
`gameid` int(11) unsigned NOT NULL auto_increment,
`gametype` tinyint(2) unsigned NOT NULL default '0',
`gamedate` datetime NOT NULL default '0000-00-00 00:00:00',
`leagueid` int(11) unsigned NOT NULL default '0',
`seasonid` int(11) unsigned NOT NULL default '0',
`roomid` int(11) NOT NULL default '0',
`gamename` varchar(50) NOT NULL default '',
`cost` mediumint(4) NOT NULL default '0',
`seats` mediumint(4) NOT NULL default '0',
`notes` tinytext NOT NULL,
PRIMARY KEY (`gameid`)
) ;

CREATE TABLE `tournament_results` (
`id` bigint(20) unsigned NOT NULL auto_increment,
`gameid` int(11) unsigned NOT NULL default '0',
`memberid` int(11) NOT NULL default '0',
`place` smallint(4) NOT NULL default '0',
`earnings` smallint(5) default '0',
`bounties` float unsigned default '0',
`points` float NOT NULL,
PRIMARY KEY (`id`)
);
Now the following code is what I would use if I just wanted to add up all the points without limiting them:

PHP

mysql_query("SELECT
members.firstname,
members.lastname,
SUM(tournament_results.points) AS tpoints,
COUNT(*) AS numgames,
AVG(tournament_results.points) AS apoints,
members.memberid AS membersid
FROM
members,
tournament_results,
leaguemembers, tournament
WHERE members.memberid = tournament_results.memberid
AND leaguemembers.memberid = members.memberid
AND leaguemembers.leagueid = '$lid'
AND tournament_results.gameid = tournament.gameid
AND tournament.seasonid = '$sid'
GROUP BY members.memberid
ORDER BY tpoints DESC");

That will total all of their games, but I want to limit it to their top 10 scores. How can I do this without introducing a query inside of the while loop?

How To Avoid Having Thousands Of Records In A Many-to-many Relation
I'm building a web administration system for my company. We keep all our contacts from other organizations in this system (stored in a MySQL database): name, address, telephone etc.

One feature of the system is that you can select a number of contacts and collectively send them an email. This works fine.

But: some of my co-workers need to know which of the contacts has received a specific email from the system.
So I was considering a setup like the following:

Table one: contacts (name, address etc.)
Table two: emails (subject, text, creation day, day of delivery etc.)
Table three: many-to-many table holding one row for each email that has been sent to a specific contact (autonum ID, ID of email and ID of contact).

My problem is that I can predict that this table will have LOTS of records in no time, as my co-workers are sending out many emails to a lot of contacts.

So: are there any better ways?
I thought about storing all contact IDs that have received a specific email in a text field in that email's record in the database - but then I'm not sure of the performance when I have to find out if someone specific has gotten a specific email etc.
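
For comparison, here is roughly what the three-table design looks like when the link table uses the two IDs as a composite primary key, sketched in Python with in-memory SQLite. The table name email_recipients, its columns, and the has_received helper are hypothetical; the post only describes the layout in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE email_recipients (
    email_id   INT NOT NULL,
    contact_id INT NOT NULL,
    PRIMARY KEY (email_id, contact_id)  -- also prevents logging a delivery twice
);
INSERT INTO email_recipients VALUES (1, 10), (1, 11), (2, 10);
""")


def has_received(contact_id, email_id):
    """True if the given contact was sent the given email."""
    row = conn.execute(
        "SELECT 1 FROM email_recipients WHERE email_id = ? AND contact_id = ?",
        (email_id, contact_id),
    ).fetchone()
    return row is not None
```

Because the lookup goes through the primary key index, it does not have to scan the table, whereas a packed text field of IDs would need string matching on every email row.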

Cleaning Up MySQL Connections To Avoid 1040
Well, I'm trying to run
PHP

$result = mysql_query("SHOW FULL PROCESSLIST");
while ($row=mysql_fetch_array($result)) {
    $process_id=$row["Id"];
    if ($row["Time"] > 200 ) {
        $sql="KILL $process_id";
        mysql_query($sql);
    }
}


To clean up my connections, as I'm getting a 1040 error "too many connections". Of course, I can't run this until I can actually connect, unless there is a way around somehow. I don't have any admin rights, I just have a web-based "php my-admin" module to run the db.

Every page people access opens a mysql connection, and then it is closed with
PHP

mysql_close($connection);


. Would putting in
PHP

<?php mysql_close($connection); $NASI_connection = null;?>


Mysql Running All Queries Double ? How To Avoid ?
Sometimes, when my database server has been slow and building up hundreds of queries, I have the impression that mysql is running 'double' or triple. It's hard to explain, but when I look at mtop (a tool to see what queries are active and what time they take to finish), I see a lot of queries that seem to hang and that are present more than once. Some of them have unique data in them that cannot happen just by refreshing a form. When the peak moment is over, and I check my site, I sometimes see, for example, the exact same forum post 3 or 4 times in the same thread. Needless to say, this causes a lot of extra load on my database.

I'm not sure what really happens, but it seems like the mysql server is just piling up queries which it cannot process fast enough, and those queries seem to come in several times again. I use separate webservers, but even if I reboot them all, the extra queries still come in as soon as the webserver is back up and php scripts start to work again. The only way to stop it all is to restart mysql, but that is something that I cannot do very easily. I rely on several heap tables for my site and they need to be refilled with data from normal tables. So restarting mysql is something I only like to do at night, with very few visitors online.

I hope I'm explaining my problem well enough. And I hope there is a way for me to check what really is going on, and if there is a way to stop mysql from 'running double' ?

What Field Type To Use To Avoid Blank Spaces In Fields
Can you tell me the best field type to use here?

I've got a table in mysql with all 5 fields defined
as tinytext

Problem is when I export this to a text file for Notepad
each field is padded out by several blank spaces,
and I think my email program doesn't like this type of structure:
field1 , field2 , field3 , field4

Scheduling Replication To Avoid Bottle-neck Updates
We have a circular master-slave setup where any one of the 2 servers
can become master at any time (by human decision). The two servers are
placed at geographically different sites. The servers contain a number
of databases which are all replicated both ways.

When we have full usage of one master ~500 inserts/updates per second,
the bandwidth between our sites becomes a significant bottle-neck. This
we can accept at database level not on server level, ie
- if database A on site B has a lag of 30min because of important
activity on database A on site A, it is acceptable.
- if database B on site B has a lag of 30min because of important
activity on database A on site A, it is not acceptable.

Is there a work-around? We never have updates concerning 2 databases in
the same query.

Creating multiple mysql servers at each site could be one, but that
means some 10-50 servers on every physical computer. What side-effects
does that create?

Get All Duplicates
I just wrote this query that I thought should return all the duplicate rows but it just runs and never returns a result and I have to restart Apache.

SELECT * FROM `usertable` WHERE `email` IN (SELECT email FROM `usertable` WHERE user_group=4 GROUP BY email HAVING count(*) >1) AND user_group=4
The piece of code "SELECT email FROM `usertable` WHERE user_group=4 GROUP BY email HAVING count(*) >1" returns all emails that are duplicates.

My goal is to export a csv of all rows that are duplicates, not just the single emails that are duplicates. If there are 3 rows of the same email I want to return all 3 rows.
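
One thing worth trying is to join against the list of duplicated emails as a derived table instead of using IN (subquery), since older MySQL versions are commonly reported to re-run such a subquery for every outer row. The sketch below shows the join form in Python with in-memory SQLite; usertable, email and user_group are from the post, while the id column and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (id INTEGER PRIMARY KEY, email TEXT, user_group INT)")
conn.executemany(
    "INSERT INTO usertable (id, email, user_group) VALUES (?, ?, ?)",
    [(1, "a@x", 4), (2, "a@x", 4), (3, "b@x", 4), (4, "a@x", 5),
     (5, "c@x", 4), (6, "c@x", 4), (7, "c@x", 4)],
)

# The derived table d is computed once; the outer query returns every row
# in group 4 whose email appears in it.
duplicate_rows = conn.execute(
    "SELECT u.id, u.email "
    "FROM usertable AS u "
    "JOIN (SELECT email FROM usertable WHERE user_group = 4 "
    "      GROUP BY email HAVING COUNT(*) > 1) AS d ON d.email = u.email "
    "WHERE u.user_group = 4 "
    "ORDER BY u.email, u.id"
).fetchall()
```

An index on the email column would likely help either form of the query.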

Duplicates
Let's say I have a table of users, and each user has a list of
categories. I could store each user's categories as TEXT with
delimiters like "cat1|cat2|cat3"

But then I need to be able to get a full list of everyone's categories,
without duplicates. Retrieving all the categories, exploding them, and
then removing the duplicates is a bit slow. Is there a better method?
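
An alternative to the delimited TEXT column is one row per (user, category) pair, which lets the database do the de-duplication. A small sketch in Python with in-memory SQLite follows; the user_categories table and the two packed sample strings are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_categories (user_id INT, category TEXT, "
    "PRIMARY KEY (user_id, category))"
)

# One-off migration of the packed strings into individual rows.
legacy = {1: "cat1|cat2|cat3", 2: "cat2|cat4"}
for user_id, packed in legacy.items():
    conn.executemany(
        "INSERT INTO user_categories VALUES (?, ?)",
        [(user_id, category) for category in packed.split("|")],
    )

# The full list without duplicates is now a single query.
all_categories = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT category FROM user_categories ORDER BY category"
    )
]
```

The exploding then happens once at migration time rather than on every read.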

Searching For Duplicates
I have a table named "article" with a mere 5,000 rows. I would like to count the duplicate titles. The following query just hangs (or at least takes longer than 10 minutes as I killed it at that point):

SELECT count(*) AS article_count
FROM article AS a1
LEFT JOIN article AS a2 ON a1.title=a2.title

Titles could be as long as 150 characters. The column named "title" is full text indexed. What am I doing wrong?

I would also like to check the article bodies for duplicates. The column named "body" could be as long as 10,000 characters and is full text indexed. Whatever solution there is for title duplicates, I would like it to also work for body duplicates. Finally, the article table is going to get a lot bigger. It needs to work with 100,000 rows or more.

Only Show Duplicates Once
is there any way of showing duplicates only once ?

-- let's say, i have a database of cars, and they search by year...

so, i create a select form that contains the years, but, the years can't show twice or more in that select form..

Preventing Duplicates
I have a table that contains rows of department data, where each department has a deptcode, a status and some other arbitrary data, there can be a maximum of three rows with the same deptcode but each must have a different status i.e. live, pending or old.

I have written a SQL update that changes the status from pending to live whilst simultaneously changing the status of the live version to old:

UPDATE RECMGR_EL_DEPT_Query SET STATUS =
case STATUS
when 'live' then 'old'
when 'pending' then 'live'
else STATUS
end
Where DEPTCODE = <dtml-sqlvar DEPTCODE type="string">;

However, if I already have a department record with a status of old (from a previous update), when I run this SQL I get another row with the same deptcode also having the status of old. So my query is: is there a way I can test for the existence of a row with the required deptcode and the status of old and delete it prior to running this SQL update?

Remove Duplicates
I have 3 fields: id, rating, and ip

sample format:

id | rating | ip
1 | 10 | 1.2.3.4
1 | 10 | 1.2.3.4
1 | 10 | 1.2.3.4
1 | 10 | 1.2.3.4
1 | 10 | 1.3.2.4
1 | 10 | 1.1.2.2

I need a query to remove all duplicate ips for rating 10 for id 1 so the above sample will end up like this:

id | rating | ip
1 | 10 | 1.2.3.4
1 | 10 | 1.3.2.4
1 | 10 | 1.1.2.2

please note that id ranges from 1-1000 and rating ranges anywhere from 1 to 10 so I need a query to remove dupe ip's from ratings 1-10 for each id
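
When whole rows are identical, as in the sample, one simple approach is to rebuild the table from a SELECT DISTINCT. The sketch below does this in Python with in-memory SQLite, using the rows from the post; the table name ratings and the helper table ratings_dedup are hypothetical. In MySQL the equivalent statements would be CREATE TABLE ... SELECT and RENAME TABLE, so the exact syntax differs slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (id INT, rating INT, ip TEXT)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 10, "1.2.3.4")] * 4 + [(1, 10, "1.3.2.4"), (1, 10, "1.1.2.2")],
)

# Copy one instance of each (id, rating, ip) combination, then swap the tables.
conn.executescript("""
CREATE TABLE ratings_dedup AS SELECT DISTINCT id, rating, ip FROM ratings;
DROP TABLE ratings;
ALTER TABLE ratings_dedup RENAME TO ratings;
""")

remaining = conn.execute(
    "SELECT id, rating, ip FROM ratings ORDER BY ip"
).fetchall()
```

Since DISTINCT covers all three columns, this handles every id and rating in one pass; indexes on the original table would have to be recreated afterwards.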

Delete Duplicates
what is the best way to:

1. detect duplicate rows
2. delete duplicate rows but keep the original

i'm using mysql 4.0.23

Removing Duplicates
I am trying to remove all the duplicates in a large table. I found the post on here regarding finding duplicates which gave the following SQL query

SELECT Domain_Name, Count(Someotherfield) AS CountOfEntry
FROM Domain
GROUP BY Domain_name
HAVING (((Count(Someotherfield))>1));

Search For Duplicates
I have a table w/ id, first name, last name, score. I want to do a search for all people who are in the table more than once, i.e. select * from table_1 where (first_name+last_name comes up more than twice.)

Finding Duplicates
I have a table with 3 columns 'sid', 'aid', and 'call'.

What i want to do is to create a query that shows me all records where
sid matches and aid matches but call differs.

For example, it would give me this result:

sid aid call
abc def gh
abc def ij

and not:

sid aid call
abc def gh
abc def gh

Currently i have a query that gives me all matches of sid's and aid's
but gives me call's that match and don't match.
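
A self-join that requires the call values to differ would give the first result and exclude the second. Here is a sketch in Python with in-memory SQLite; the column names are from the post, the table name calls and the sample rows are hypothetical. The column is quoted because CALL is a reserved word in some MySQL versions (MySQL would use backticks rather than double quotes).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE calls (sid TEXT, aid TEXT, "call" TEXT)')
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [("abc", "def", "gh"), ("abc", "def", "ij"),
     ("xyz", "def", "gh"), ("xyz", "def", "gh")],
)

# Keep a row only if another row has the same sid and aid but a different call.
differing = conn.execute(
    'SELECT DISTINCT a.sid, a.aid, a."call" '
    "FROM calls AS a "
    "JOIN calls AS b "
    '  ON b.sid = a.sid AND b.aid = a.aid AND b."call" <> a."call" '
    'ORDER BY a.sid, a.aid, a."call"'
).fetchall()
```

The xyz rows share sid and aid but have the same call, so they are left out.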

Find Duplicates
Is there an easy way to find duplicate rows (where every field is the same) and then remove all but 1 of the duplicates?

Finding Duplicates
I have a database that lists words with their part of speech. The textfile that I uploaded it from was a combined list, so naturally it had duplicates (ex: 'moo' is featured twice as a noun).

My question is, which queries would select only those rows, and then subsequently delete one of them? I'm new to MySQL and I can't seem to figure out one that works.

Locating Duplicates
What is the quickest/easiest way to extract duplicate information from a database?

I have a database which consists of users details including email addresses. Some email addresses are repeated, so now I want to search for these and then delete the duplicate?

Is there way to do this through the phpmyadmin/mysql control panel - or what query would i need to use?

Selecting Duplicates
How can I modify this query so that it shows me records which are duplicates only (i.e. have the same domain name)?

SELECT Domain_Name, Status FROM Domain WHERE Status = 'Ready'; .

Duplicates And Appending
I have a MySQL table with 6 columns. I want to be able to set up the primary key on the first column, which contains the ID number. My problem is that this table has been created over time and as such there exist multiple rows with the same ID in column 1. I need to keep every instance of the ID and its row but would like to combine them into 1 row in my table. Is there a way to do this in MySQL?

Removing Duplicates
I have a table with about 50,000 entries.
Is there an easy way to remove all the duplicate entries?
I could go through the entire table one at a time, select the field then delete from the table where it equals that field and then re-insert it back in to the table.
Is there any easier way to do it?

Deleting Duplicates
i have a table that looks like this:

article_id | article_title

so there's a bunch of article titles with ids:

1 - google launches search
2 - local headlines
3 - how to cook

got it?

there are thousands of rows in this table. some of the articles have the same title(it's a duplicate). i need to write a script to remove the duplicate. ie. when there's two or more titles that are the same delete all of them except for one.
so i need a script that deletes duplicated rows basically.

Removing Duplicates With Unique IDs?
I searched around and found a few threads similar to my problem, but I wasn't able to find a very clear answer to this question.
Basically I have a huge database (8.2 m records) full of destinations, things like restaurants, hotels, etc, and there are a lot of duplicates. The duplicates are in the database from the source, and we cannot change that.
Some duplicates actually have up to 3 columns that are different. The ID is unique for each row, and the category number can be different as well. I've even seen a couple instances where the zip code does not have the leading zero. Here are some example rows:
ID_____NAME_____ADDRESS_____CITY_____STATE_____ZIP_____CAT
544 WAL-MART 1234 MAIN ST DALLAS TX 05424 1234
545 WAL-MART 1234 MAIN ST DALLAS TX 5424 4321
546 WAL-MART 1234 MAIN ST DALLAS TX 05424 1234
The reason why these duplicates exist (probably) is because one WAL-MART could fit into multiple categories, because it's a grocery store, a department store, a photo center, etc. I don't want all that junk, I just want 1 record for 1 Wal-Mart
How could I go about deleting these duplicate records? and if possible, I'd like to be able to choose which record is kept, based on the category ID.

Copy And Remove Duplicates
I have a table that I need to add a lot of new data into. But there are a couple of problems with that.....

First, the new data is not completely compatible with the old so I need to copy data from one column to another, if the second column is blank. I also need to make sure that there are no duplicate entries.

I am using MySQL version 5 and I am not real sure how to proceed.


Delete Duplicates From 2 Tables
MySQL
CREATE TABLE `subscribers` (
  `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `email` VARCHAR(60) NOT NULL DEFAULT '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

I have an accompanying table that houses additional subscriber data, with the column sd_sub_id being the subscribers table FK

MySQL
CREATE TABLE `subscribers_data` (
  `sd_sub_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `sd_ip_address` VARCHAR(60) NOT NULL DEFAULT '',
  PRIMARY KEY  (`sd_sub_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

Chances are that there won't be duplicate rows, but! I don't want to take that chance so I'm cooking up a 'remove duplicates' function. I have it set up to remove the duplicates from the subscribers table, but how do I get the corresponding records from the subscribers_data table and delete them as well?

MySQL
-- Keep one row per email address: the one with the lowest id.
CREATE TEMPORARY TABLE subscribers_temp(id INT(10) UNSIGNED, email VARCHAR(60)) ENGINE=MEMORY;

INSERT INTO subscribers_temp(id,email) SELECT MIN(id), email FROM subscribers GROUP BY email;

DELETE FROM subscribers;

INSERT INTO subscribers(id,email) SELECT id,email FROM subscribers_temp;

Easy Way To Delete Duplicates?
I had 2 membership db's. One had all the members (65K) db2 had all the members that have registered on the website. I am wanting to consolidate the two.

Basically i took both, and merged them together into their own db. now the membership list is about 80k. 20k or so of that are duplicates. My idea was to export the thing as .xls and remove the duplicates via Microsoft Excel. However, Excel has a limit of 65Kish rows. so i can't really go about that.

Special Remove Duplicates
I have a query that returns a result set that is composed of events with a start datetime and an end datetime. Some of these events occur during another event. For example:

User AAA Event 1 Start Datetime 2006-11-01 12:00:00 End Datetime 2006-11-07 18:43:00
User AAA Event 2 Start Datetime 2006-11-03 08:21:00 End Datetime 2006-11-05 19:32:00

I am trying to get a result set that does not include those events that occur during another event. Or for the example above I want Event 1, but not Event 2. I am looking for a little direction.

Subquery For Duplicates Removal
I have the following query. It produces some details for a specific date ($scandat) and for a specific location ($_GET["dep"])

Select s.*, substr(scanid,6)+0 as sort, c.senddep, c.deliverdep, c.docnum
from scanarch s,consignments c
where (s.scantype = 'O')
and s.scandate = '$scandat'
and s.depotscan<>0
and left(s.identifier,22)=c.identifier
and c.senddep = ".$_GET["dep"]. "
and s.identifier <> ''
order by sort";

It's a report of scans (barcodes) which match the criteria of the where clauses. Anyway, an item can be scanned twice which makes it look like there are more records than there really are.

So i want to select only 1 unique record per day. If the identifier and date matches, i only want 1 of the records. if the identifier matches but the date is different, i would want 1 unique per each day.

Hope this makes sense. I tried using a subquery, selecting distinct identifier/date before doing the above query (on the already "filtered" list of uniques) but it means a bunch of my needed output fields aren't available to the main query. And I can't include them in the subquery because, as you know, the distinct applies to every field, and including the "time" as well as the date will make every single item unique.

Ignore Duplicates But Order By An Value
I don't find an SQL-Statement to get the result I want :)

The content of the table is something like this:
name, text, value
-------------------
aaa dsd 1.31
aaa ads 2.87
aaa fdsf 4.54
ccc fdsf 1.51
ccc ass 6.53
ddd dsd 7.31

I want to get the values, the text of each name ordered by value
So the result should be:
ddd dsd 7.31
ccc ass 6.53
aaa fdsf 4.54

The statement I built (which is wrong :) is
SELECT name,text,value FROM table
GROUP BY name ORDER BY value DESC

It returns:
ddd dsd 7.31
ccc fdsf 1.51
aaa dsd 1.31
So it takes the first row found in the table for each name, and not the one with the highest value.
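
A common way to get the whole row holding each group's maximum is to compute the maximum per name in a derived table and join back to it. The sketch below runs this in Python with in-memory SQLite, using the data from the post; the table name items is a stand-in, since the real name is not given, and the derived table would need MySQL 4.1 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, text TEXT, value REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("aaa", "dsd", 1.31), ("aaa", "ads", 2.87), ("aaa", "fdsf", 4.54),
     ("ccc", "fdsf", 1.51), ("ccc", "ass", 6.53), ("ddd", "dsd", 7.31)],
)

# m holds the highest value per name; joining back recovers the matching text.
top_rows = conn.execute(
    "SELECT t.name, t.text, t.value "
    "FROM items AS t "
    "JOIN (SELECT name, MAX(value) AS max_value "
    "      FROM items GROUP BY name) AS m "
    "  ON m.name = t.name AND m.max_value = t.value "
    "ORDER BY t.value DESC"
).fetchall()
```

If two rows of the same name tie for the highest value, both are returned.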

Selecting From Two Tables, But No Duplicates
I want to select from two tables. Each table has a column called "course_id". I want to select from one table if the "course_id" doesn't exist in the other table. How would I do this in one or multiple queries? I was looking at unions or joins but I don't know if that is the most efficient or easiest solution. Any help is appreciated.

Edit: I'm only really selecting from the first table, but making sure that the "course_id" doesn't exist in the second table.

Preventing Duplicates On Non-key Fields
Is there a way that you can put an index on a field (like in Access) so that duplicates aren't allowed? In Access, you can put an index on a field and say duplicates OK or no duplicates allowed.
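
The closest equivalent is a unique index. The sketch below demonstrates the behaviour in Python with in-memory SQLite; CREATE UNIQUE INDEX is also valid MySQL syntax, though the error raised on a duplicate differs by database. The members table and the idx_members_email index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# The unique index is what makes email a "no duplicates" field.
conn.execute("CREATE UNIQUE INDEX idx_members_email ON members (email)")

conn.execute("INSERT INTO members (email) VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO members (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The second insert violates the unique index and is refused.
    duplicate_rejected = True

member_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
```

The field does not need to be the primary key for this to work.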

DISTINCT Outputs Duplicates
I have an attribute "src" in one of my MySQL database tables which contains values such as node/83 and node/84. My table contains 2 same values: node/83 and node/83
I'm using SELECT DISTINCT(src) to remove all duplicates from that attribute but as I just said, those 2 same values still show up in the results.

How come those duplicates were not removed when using DISTINCT? Is it because of the / in the value?


Copyright © 2005-08 www.BigResource.com, All rights reserved