Tracking Forums, Newsgroups, Mailing Lists




Table Design Advice


I'm looking for some advice on suitable ways to construct a table or tables to handle the following situation.

I have a dynamic and structured content requirement, where the user can define a new type of item with a new/different set of fields. So I have

item types:
- name, fields

items themselves
- one or more field+content pairs depending on the information in item-types.

That in itself is pretty straightforward: I can store the item details in a single table of item-id, field-id, value triplets, i.e. my db could follow the structure shown at the bottom of the post. However, I need to be able to filter for items based on the values of several fields.

In order to do that I can join the item table to itself for each field after the first, e.g. SELECT i1.item-id FROM item-details i1, item-details i2, item-details i3 WHERE i1.item-id = i2.item-id AND i1.item-id = i3.item-id AND i1.field-id=<field1> AND i1.value=<expr1> AND i2.field-id=<field2> AND i2.value=<expr2> AND i3.field-id=<field3> AND i3.value=<expr3>;

If the item needs to be selected on five fields then there would need to be four joins; on seven fields, six joins; and so on.

Also, I would like to be able to search the field values without reference to the fields themselves. e.g. Search for all items with a value containing "red".
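
One possible alternative to adding a join per field is to match all the wanted field/value pairs in a single pass and keep the items that matched every one of them. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table name item_details (underscores instead of hyphens), the sample rows and the filter values are all assumptions made for illustration, not part of the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout of the item-details table from the post, one row per (item, field).
conn.execute(
    "CREATE TABLE item_details ("
    " item_id INTEGER, field_id INTEGER, value TEXT,"
    " PRIMARY KEY (item_id, field_id))"
)
conn.executemany(
    "INSERT INTO item_details VALUES (?, ?, ?)",
    [
        (1, 1, "red"), (1, 2, "large"), (1, 3, "cotton"),
        (2, 1, "red"), (2, 2, "small"), (2, 3, "cotton"),
        (3, 1, "blue"), (3, 2, "large"), (3, 3, "wool"),
    ],
)

# Hypothetical filter: field 1 = 'red' AND field 2 = 'large' AND field 3 = 'cotton'.
wanted = [(1, "red"), (2, "large"), (3, "cotton")]

# One OR-ed condition per field; an item qualifies only if every condition
# matched a row, which the HAVING clause checks by counting the matches.
conditions = " OR ".join("(field_id = ? AND value = ?)" for _ in wanted)
params = [p for pair in wanted for p in pair] + [len(wanted)]
sql = (
    "SELECT item_id FROM item_details"
    f" WHERE {conditions}"
    " GROUP BY item_id HAVING COUNT(*) = ?"
)
matching_items = [row[0] for row in conn.execute(sql, params)]

# Field-agnostic search: every item that has any value containing 'red'.
red_items = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT item_id FROM item_details"
        " WHERE value LIKE ? ORDER BY item_id",
        ("%red%",),
    )
]
```

The count comparison relies on the primary key guaranteeing at most one row per item and field. The query text grows with the number of fields, but the number of joins does not.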

Is this the best way to handle the problem?
Is there another table design that would work better?
e.g. for each item-type, use the field information to create a table specific to that item-type with columns for each of the fields.


Possible structure
=============================

item-type
- type-id
- type-name

fields
- field-id
- field-name
- other field data (description, validation, etc)

item-fields
- type-id
- field-id

items
- item-id
- type-id

item-details
- item-id
- field-id
- value





See Related Forum Messages: Follow the Links Below to View Complete Thread
Design Advice
I am looking at designing a mini database to handle subsets of data at my company. You probably know the drill--it's too much to handle with a spreadsheet, but not enough to warrant hassling the DBAs with loading it into our main servers.

So I am looking at managing the data via a local instance of MySQL. The largest file I need to manage at the moment is about 100 columns x 165k rows.

Obviously, I wouldn't build a table of that size. That's the whole point of relational DBs.

But here's the deal, at my company, once someone finds a way to do something, it tends to stick. So I'm looking at solving a few temporary problems by managing subsets of data from our main DB.

I have this suspicion that once I am able to do this, the demands will grow, and all of a sudden, I'll get stuck managing a whole bunch of data for a bunch of people.

So I want to build the DB right. I've never built a DB before. I just query the main DB.

Without getting into proprietary details, I can say that I am looking at unique IDs that correspond to clients--about 165k, plus all relevant associated data.

What I am tracking is which clients have performed what jobs for us, when, and what the result was.

I need to be able to run reports on this, and I need to add data at the end of each week.

Given all of the above, what general advice do you have? Are there any huge pitfalls to avoid? Anything that I absolutely should do? Keep in mind that for now, this is just going to be run on a Core2Duo desktop with 2 gigs of RAM, so anything I can do design-wise to speed things along will be useful.

Luckily, I am not in a time crunch. But thanks in advance for whatever advice you are able to come up with.

Table Structure Advice
I have two tables - Users and Groups. Pretty straightforward layout where the Users table has ID, last_name, first_name, etc. The Groups table is simply ID, name, description, etc. I want to include in my Users table a group_ids field. What makes this complex is that one user could be a member of just one or many groups, so I can't simply say User1 is in Group1. User1 may be in Group2, Group17, etc.

How does one (or how would you) apply this? I've thought about making the group_ids field a 'text' datatype and separating the group ids by comma or space, but for several reasons this doesn't seem to be the right way to do this.

Let me know if I haven't explained this well or if I should give an example of the data for clarification.
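
A common way to model "one user, many groups" is a third table holding one row per membership, rather than a delimited group_ids field. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table names user_account, user_group and user_group_member and the sample rows are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_account (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT);
    CREATE TABLE user_group   (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical junction table: one row per (user, group) membership.
    CREATE TABLE user_group_member (
        user_id  INTEGER NOT NULL REFERENCES user_account(id),
        group_id INTEGER NOT NULL REFERENCES user_group(id),
        PRIMARY KEY (user_id, group_id)
    );
    INSERT INTO user_account VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob');
    INSERT INTO user_group   VALUES (1, 'Group1'), (2, 'Group2'), (17, 'Group17');
    INSERT INTO user_group_member VALUES (1, 1), (1, 2), (1, 17), (2, 2);
""")

# All groups that user 1 belongs to.
groups_of_user1 = [
    row[0]
    for row in conn.execute(
        "SELECT g.name FROM user_group g"
        " JOIN user_group_member m ON m.group_id = g.id"
        " WHERE m.user_id = ? ORDER BY g.id",
        (1,),
    )
]

# All users that belong to group 2.
members_of_group2 = [
    row[0]
    for row in conn.execute(
        "SELECT u.first_name FROM user_account u"
        " JOIN user_group_member m ON m.user_id = u.id"
        " WHERE m.group_id = ? ORDER BY u.id",
        (2,),
    )
]
```

Adding or removing a membership is then a single-row INSERT or DELETE, and both directions of the lookup can use an index.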

Advice On Table Setup To Maximize Efficiency
My table setup is garbage. I knew this when I created the table a couple of years ago. Now that I have tonnes of data in it, the queries are taking forever. So it's about time I constructed a more efficient table. I am turning to this forum for some advice.

My table stores RSS feed content in it -- every 15 minutes a cron job INSERTs data from 300+ feeds and dumps it into the table.

The table structure consists of:

id
dateUnix
headline - headline of the article
permalink - URL to the article
body - TEXT type - body of the article
author - person who wrote the article
authorURL - URL to the person's website
favico
sitetype - I categorize the types of sites (0 = news media, 1 = blogs, 2 = videos, 3 = podcasts)
vote - users vote on stories
verified - 1 = verified, 0 = censored/banned story
newsCat - 20 different categories
newsSubcat - 50 different subcategories

I think the best way to make this table more efficient would be to break it up into 3-4 different tables using the id as the key.

ps: my cron job checks for duplicate entries and does not insert duplicates -- so duplicate stories are not entered

Table Design Question? House Table, Owner Table, Code Violations Table - Best Way?
Given the tables:

HOUSE
house_ID
address

OWNER
owner_ID
name
telephone...

HOUSE_OWNER_JOIN
?

CODE_VIOLATION_HISTORY
house_ID
violation_ID
violationStatement
...

My goal is to be able to track code violations of the house PER owner.

For example, I need to display a page that shows the current house with its code violations and a link to show the HOUSE's history of violations regardless of owner, like:

House 1009283
Address
Past history (link to the following)

House History
2001-01-04 Owner: John Smith Code Violation: Gutter issue
1999-06-01 Owner: John Smith Code Violation: Faulty Steps
1998-03-02 Owner: Sam Spade Code Violation: Driveway cracks
1990-01-12 Owner: Keith Sledge Code Violation: Grass untidy


For the design of the HOUSE_OWNER_JOIN table, I thought of two ways I could go on this and this is where I need your help.

Option 1:
Have the HOUSE_OWNER_JOIN table keep dates so I can track the ownership changes that way:

HOUSE_OWNER_JOIN
houseID
ownerID
dateOwnershipBegan
dateOwnershipEnded

then I could look up all code violations by date and associate them with their rightful owner.
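
As a rough illustration of how Option 1 could resolve each violation to the owner at the time, the sketch below joins on the ownership date range. It uses Python's sqlite3 module as a stand-in for MySQL. The violationDate column, the ownership dates and the convention that dateOwnershipEnded is NULL for the current owner are assumptions; the post does not list them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (owner_ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE house_owner_join (
        houseID INTEGER, ownerID INTEGER,
        dateOwnershipBegan TEXT,
        dateOwnershipEnded TEXT   -- assumed NULL while the owner still holds the house
    );
    -- violationDate is an assumed column; the post elides the remaining fields.
    CREATE TABLE code_violation_history (
        house_ID INTEGER, violation_ID INTEGER PRIMARY KEY,
        violationDate TEXT, violationStatement TEXT
    );
    INSERT INTO owner VALUES (1, 'Keith Sledge'), (2, 'Sam Spade'), (3, 'John Smith');
    -- Ownership periods are invented for the example.
    INSERT INTO house_owner_join VALUES
        (1009283, 1, '1985-01-01', '1995-05-01'),
        (1009283, 2, '1995-05-01', '1999-01-01'),
        (1009283, 3, '1999-01-01', NULL);
    INSERT INTO code_violation_history VALUES
        (1009283, 1, '1990-01-12', 'Grass untidy'),
        (1009283, 2, '1998-03-02', 'Driveway cracks'),
        (1009283, 3, '1999-06-01', 'Faulty Steps'),
        (1009283, 4, '2001-01-04', 'Gutter issue');
""")

# Match each violation to the ownership row whose date range contains it.
# ISO-formatted date strings compare correctly as text.
history = conn.execute("""
    SELECT v.violationDate, o.name, v.violationStatement
    FROM code_violation_history v
    JOIN house_owner_join j
      ON j.houseID = v.house_ID
     AND v.violationDate >= j.dateOwnershipBegan
     AND (j.dateOwnershipEnded IS NULL OR v.violationDate < j.dateOwnershipEnded)
    JOIN owner o ON o.owner_ID = j.ownerID
    WHERE v.house_ID = ?
    ORDER BY v.violationDate DESC
""", (1009283,)).fetchall()
```

This only works cleanly if ownership periods for a house never overlap; otherwise a violation would match more than one owner.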

==================================================
Option 2:
Have the HOUSE_OWNER_JOIN table be the primary keeper of identity data by adding a new primary key and changing the CODE_VIOLATION_HISTORY table to reference that table, by changing the referencing key from house_ID to house_owner_ID:

HOUSE_OWNER_JOIN
house_owner_ID
houseID
ownerID
dateOwnershipBegan
dateOwnershipEnded

CODE_VIOLATION_HISTORY
house_owner_ID
violationStatement
...

Table Optimizing For Big Table. Need Advice.
I need a little help optimizing requests on a table, and I have no
idea how to do it (I have already optimized a lot of other tables, but this one
is a pain). My table looks like:

CREATE TABLE comments (
id int(11) NOT NULL auto_increment,
subject varchar(80) NOT NULL default '',
timestamp timestamp(14) NOT NULL,
user_id int(6) NOT NULL default '1',
com_parent int(11) NOT NULL default '0',
res_type int(5) NOT NULL default '1',
PRIMARY KEY (id),
KEY id (id),
KEY news_id.........

Table Design
I am designing a table for "friend relationships".

I have three possible solutions:
a)
user_id (int)
friend_id (int)
created_at (timestamp)

This will require 2 rows per friendship.
b)
user_id (int)
friend_id (int)
user_id2 (int)
friend_id2 (int)
created_at (timestamp)
This requires only 1 row, but 2 indices.
c)
user_id (int)
friends (text)

This contains a comma-delimited list of all friend_ids.

I am leaning towards option c. Of course, adding & deleting friends will be a bit slower, but the bulk of the load is on reading friend data anyway.

However I have a question if this would be slower:

QUERY 1
select items.* from items inner join friends ON friends.friend_id = items.user_id WHERE friends.user_id = [my_user_id]

QUERY 2
select items.* from items where user_id IN (select friends from network where user_id = [my_user_id])

QUERY 3
$friends (variable) = select friends from network where user_id = [my_user_id]

select items.* from items where user_id IN ($friends)

Query 3 requires 2 queries.

1. Which of these would be faster?
2. And would you still choose the solution you chose in 1. if I had to use the friend userlist on multiple occasions on the same page?

(select from groups, select from bulletins, select from items, select from users)



Table Design
I have a question about table design, here is the background.

The main table is "Customer" with standard stuff, first/middle/last name etc. The next table is CustomerType. CustomerType has 16 types and a Customer can have more than one type. So below is what I think is the correct design and I just wanted to confirm this is correct:

Table: Customer
CustID INT(11) not null primary key auto_increment
First varchar (15)
Middle varchar (15)
Last varchar (15)
ENGINE=INNODB

Table: CustomerType
CustID INT(11) not null
Type1 enum('Y','N') not null default 'N'
Type2 enum('Y','N') not null default 'N'
Type3 enum('Y','N') not null default 'N'
Type4 enum('Y','N') not null default 'N'
FOREIGN KEY(CustID) REFERENCES Customer(CustID)
ENGINE=INNODB

Is there a better way?

Table Design
I need a table with a column that will hold a numeric value. In some instances the numeric value will be a percent and in some it'll be a dollar value.

My questions are:

1. What is the best datatype to use? Currently I have it setup as a Decimal(10,2) datatype.

2. Guessing the datatype above is suitable, how can I tell if the value is a percent or a dollar amount?

3. There are some instances where, if the value is a percent, that it needs to be identified as < xx%. Using the Decimal dt, how could I record the "<" that needs to accompany it?

How To Design The Table?
I have the following situation:

Table: member
One of the table fields: hobby

My member table needs transactions, and at the same time I need to perform a FULLTEXT search on the hobby field.

I know FULLTEXT is supported by MyISAM and transactions are supported by InnoDB.

So how can i do that?

Table Design
This is more related to table design, but there is no forum for that so I figured I'd post it here since I'm using a MySQL server. Basically I have two ways of designing the structure of a table and I need advice. The app I am building requires that a user have access to certain files, so I have to store which files they have access to in a table. I can either create it with two fields: one for user id and one for the files they have access to. The other way I was thinking of is to create a table with one field for user id and one field per file. The second option would make it easier to create a client administration screen since there would only be one row returned per user. But it does run the risk of having a lot of fields.

Table Design
What is the preference with tables and column numbers?
I was planning to build a table with 45 columns, but is it better to split the table into smaller tables with 10 columns each and then connect them via a primary key?

Table Design
I have a bunch of baskets. There are about 6 types of baskets, and some can only contain fruit, another vegetables, another cookies, and so on. Furthermore, each is designed to be able to only hold one of each type (e.g. the fruit basket has a slot for an apple, an orange, a peach, and a pear). Some baskets have more slots than others. Also, it is possible for a basket not to be full and have empty slots. Lastly, each basket type has a couple of unique fields (e.g. fruit basket has location grown, cookie basket has chef's name, etc).

I need to create a list of all items in all my baskets along with the basket number and basket type such as:
Basket 1, Fruit Basket, Apple
Basket 1, Fruit Basket, Peach
Basket 2, Vegetable Basket, Radish
Basket 3, Fruit Basket, Peach
...

I am having a difficult time figuring out how to structure the tables.

Table Design
I need a DB design to emulate an operating system so I can add folders
(Categories) with unlimited folders (Sub categories) inside folders.

Mysql Table Design
Any advice is welcome with this inquiry. I have been learning MySQL for two weeks after buying books from Kevin Yank and I would like to make a project called Attendance Monitoring System or TimeIn-TimeOut Monitoring System using the PHP/MySQL platform. However, I am very unsure what to do about database or table design.

Is it feasible to make one table for employees' TimeIn and TimeOut? Or do I have to make separate tables for TimeIn and TimeOut? If so, how would I fetch and match an employee's TimeIn with TimeOut for particular shifts? There are 3 shifts at work: 23:00 - 08:00, 02:00 - 22:00 and 8:00 - 17:00 on weekdays; weekends are rest days.

Database Design: Keep One Table Or Add More?
This is perhaps due to bad planning, but regardless, it's now an issue.

I plan to have different sections on my site, each with a general article feel but displayed differently and whatnot.

I decided on using a simple `article` table with things such as title, short description, body, author, and so on.

The problem is now I realize each section might need a bit more data than other sections.

The question is: Do I make these separate tables or just add fields onto my existing article table to handle these edge cases?

There are a few issues with separate tables. One being my comment system will no longer work since it assumes everything is an article.

Using one table with extra fields would solve this problem but then some of the columns will be null for each row.

I don't think that is such a bad thing, especially since I'm not going to have too many rows in the first place.

What do you guys suggest? I'm leaning towards adding more columns to my existing database.

Table Design Dilemma
While designing the database backend for a character management system for a fantasy game, I've come across a dilemma:

In one particular table, there is a field that (without getting into the game mechanics) stores bonuses on equipment/effects/whatever. The problem is this: most of the time there is precisely one bonus, occasionally two or three. I'd say the statistical breakdown would be roughly:
1 bonus: 45%
2 bonuses: 35%
3 bonuses: 19%
4 or more: 1%
There's theoretically no maximum to the number of bonuses for a given piece of equipment/effect/whatever.

To make this clearer, here is an example of what a list of 4 bonuses might look like if represented as a comma-delineated string:
str+6,ac+23,init+4,dex-2

So how do I store this? My first thought was to use a Varchar field and programmatically break it apart in my application (trivial to do), but that would limit me to an artificial maximum. A many-to-many relational table also occurred to me, but this seems ridiculous when one realizes that a given bonus could exist solely from a single source; also, there are countless bonuses that are just minor adjustments to each other, for example one could be str+1, another str+2, another str+3, etc. Despite that, however, I'm leaning toward the many-to-many relational table as my solution.

I've also considered using a TEXT field, but I'm not sure that'd be the appropriate approach here.

So what do y'all recommend? The vast majority of the time, this data is small (well under 20 characters 99% of the time if represented as a string; the example above is 24 characters but is an example of the extremely rare 4-or-more case), but I want to be able to handle the theoretical case of a really really long list of bonuses.

Question On Db Table Design
I'm designing a db and have 2 ideas on the table design and was hoping I could get some comments on the pros and cons of the 2 designs.

Basically, this is for toolbar information that will be stored in db tables, and the data is somewhat in an outline form such as:

Toolbar Heading 1
    Menu Heading 1
        Link 1
        Link 2
    Menu Heading 2
        Link 3
        Link 4
Toolbar Heading 2
    Menu Heading 3
        Link 5

So, there are 3 tables: Toolbar Heading, Menu Heading, and Links. For simplicity, I'm leaving out all the info that would be in those tables. Here are my 2 design options:

1. Add a field in the Menu Heading and Links tables indicating the "parent ID." For example, Menu Heading 1 would have a parent field containing the ID of Toolbar Heading 1. Likewise, Link 2 would have a field containing its parent: Menu Heading 1. And so on.

2. Don't put the parent IDs in the Menu Heading and Links tables. Instead, create 2 tables that indicate the relationships. For example, a Toolbar Headings To Menu Headings Table would list the IDs of all the Menu Headings and their associated Toolbar Headings. Likewise, there would be a Menu Headings To Links Table indicating all the Menu Headings and their associated Links.

Db/table Design Question
I have a table/db I need to create that has a rectangular structure even
when normalized, but the difficulty is that the rows can have a large
number of columns - over the max of what mysql can handle. If I put the
data in vertical format I lose a lot of functionality. If I split the
table into n tables and keep the data horizontal it makes
things cumbersome. I'm sure this is a common problem but I have
never needed more than 5-20 columns per table.

I am also considering writing code that would compress/decompress the
columns into, say, a blob, but this again leaves me out in the cold, without
the power of sql.

Design Of Table For .doc Archive
I'm trying to build an archive of documents and need some help with the table structure to make the archive searchable.

Examples of my docs are:

Main category: Holiday, Lifecycle, History
Sub Category: New Year, Birth, Independence Day

Some files may be a main category i.e. a general doc on holiday, others could belong to more than one sub category i.e. a main category Calendar would have New Year and Independence Day as well. I also don't know if I could get them all under a main and sub and may need a third level.

I would like the user to find the file by 1) search box, 2) keyword index, 3) clicking on a main category, which will bring them to the sub category, and so on, 4) searching through a drop down menu. (Did I miss any?)

Am I building this properly with main and sub cats, or is there some other method? And how do I structure the table for sub and sub sub cat? What do I do with a file that doesn't really belong anywhere, make a misc.?

Table Design Question
I am building a website that allows users to make a custom playlist for the site's media player. The playlist has no maximum number of entries, and each user to this site will have their own playlist. What would be the best way to organize this data?

Should each user have their own table, with simply a list of which files to play? Or should I put all playlists in 1 large table? Is MySQL the best way to store this data?

GeoTarget And Table Design
I want to add a GeoTarget feature to my in-house AD system. The problem is how to add country attributes to my AD table. For example, to select all ADs for US customers, the query should look like this: SELECT * FROM table WHERE us=1.

To select all ADs for CA customers, the query will be
SELECT * FROM table WHERE ca=1.

But it is impossible to add N columns (us, uk, ca, cn...) to represent N countries. What is your suggestion? Can I use "Bit AND" operator on an index in WHERE clause?

Normalization: Two Examples Of Table Design. Which Is Better?
Assuming I have an application where the user will only have one mother, father, and spouse, which is the more logical format to use?

CREATE TABLE user
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
created DATETIME NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
bday DATE NULL,
PRIMARY KEY (id)
) ENGINE = INNODB;

CREATE TABLE mother
(
user_id INT UNSIGNED NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
bday DATE NULL,
PRIMARY KEY (user_id),
FOREIGN KEY (user_id) REFERENCES user(id)
) ENGINE = INNODB;

CREATE TABLE father
(
user_id INT UNSIGNED NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
bday DATE NULL,
PRIMARY KEY (user_id),
FOREIGN KEY (user_id) REFERENCES user(id)
) ENGINE = INNODB;

CREATE TABLE spouse
(
user_id INT UNSIGNED NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
bday DATE NULL,
PRIMARY KEY (user_id),
FOREIGN KEY (user_id) REFERENCES user(id)
) ENGINE = INNODB;


*** .....OR..... ***

CREATE TABLE user
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
created DATETIME NOT NULL,
PRIMARY KEY (id)
) ENGINE = INNODB;

CREATE TABLE family
(
user_id INT UNSIGNED NOT NULL,
cat_id TINYINT UNSIGNED NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
bday DATE NULL,
PRIMARY KEY (user_id, cat_id),
INDEX (cat_id),
FOREIGN KEY (cat_id) REFERENCES cat(id)
) ENGINE = INNODB;

CREATE TABLE cat
(
id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(10) NOT NULL,
PRIMARY KEY (id)
) ENGINE = INNODB;
INSERT INTO cat (id, name) VALUES (1, 'mother'), (2, 'father'), (3, 'spouse');

Good Table Design Question
I have a quick question, let's say I have a table w/ 8 million rows, and each row is indexed by userid and "type", and let's say each user can have 8 "types". Now I was thinking of separating the tables into 8, based on the type, so I would have something like user_type1. And each row will have a unique userid as primary key.
Now the question is, is it better to separate the tables or keep it as one huge table?
Sample queries will look like:
- SELECT * FROM big_table where userid = 123 => this will return multiple rows
or some of these:
- SELECT * FROM small_table1 where userid = 123 => this will always return 1 row because userid is primary key



MySQL Table Design For Web Tracker
I'm currently writing a PHP based tracking script and was wondering about an expedient mysql table design. I'm not nearly as experienced with mysql as I am with PHP, so I need a little help.

The main problem is performance, so you may already assume that I'm not looking for poorly conceived table structures but some kind of denormalized ones, in order to improve the performance.

So what's the best way to store the logged information? What you see here is the db structure from phpOpenTracker and that is exactly the way it shouldn't be.
This normalized schema is horribly slow if you have to evaluate a lot of data.

I know a lot about application design but not about db design, so it would be great if anyone could give me some approaches and/or examples.

Trying To Conceptualize Table Design For Stats
I have a stats page my users can go to so they can view how much traffic their listing has received.

Creating this for the total views is straightforward and I have that working. But I want to add two more features to this:

1. See the total views for just today
2. See a history for the listing over the duration of the post (i.e. 30 days).

I am at a loss as to how I would create a table or several tables to accomplish this. I am looking for some guidance to get me thinking about the best way to do this.
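
One way to support both "today" and a 30-day history is a table with one counter row per listing per day. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table name listing_views_daily, the listing id 42 and the dates are hypothetical. The insert-then-update pair is SQLite syntax; in MySQL the same effect is usually written with INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one counter row per listing per calendar day.
conn.execute("""
    CREATE TABLE listing_views_daily (
        listing_id INTEGER NOT NULL,
        view_date  TEXT    NOT NULL,
        views      INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (listing_id, view_date)
    )
""")

def record_view(listing_id, day):
    """Create the day's row if it is missing, then bump its counter."""
    conn.execute(
        "INSERT OR IGNORE INTO listing_views_daily VALUES (?, ?, 0)",
        (listing_id, day),
    )
    conn.execute(
        "UPDATE listing_views_daily SET views = views + 1"
        " WHERE listing_id = ? AND view_date = ?",
        (listing_id, day),
    )

for day in ["2024-03-01", "2024-03-01", "2024-03-02",
            "2024-03-03", "2024-03-03", "2024-03-03"]:
    record_view(42, day)

today = "2024-03-03"  # fixed date so the example is reproducible

# Views for just today.
views_today = conn.execute(
    "SELECT COALESCE(SUM(views), 0) FROM listing_views_daily"
    " WHERE listing_id = ? AND view_date = ?",
    (42, today),
).fetchone()[0]

# Day-by-day history for the last 30 days.
history = conn.execute(
    "SELECT view_date, views FROM listing_views_daily"
    " WHERE listing_id = ? AND view_date > date(?, '-30 days')"
    " ORDER BY view_date",
    (42, today),
).fetchall()

# The all-time total can still be derived from the same table.
total_views = conn.execute(
    "SELECT SUM(views) FROM listing_views_daily WHERE listing_id = ?", (42,)
).fetchone()[0]
```

The table grows by at most one row per listing per day, and days with no traffic simply have no row.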

Conceptualize Table Design For Stats
I have a stats page my users can go to so they can view how much traffic their listing has received. Creating this for the total view stat is straightforward and I have that working. But I want to also add two more features to this:

1. See the total views for just today
2. See a history for the listing over the duration of the post (i.e. 30 days).

Search Design: Keywords Table (MySQL)
I'm trying to implement keyword searching in my database.
Currently, I have a table containing all my entries with description and name.
The existing sql search utilises fulltext search to find the relevant data from the description and name fields. I'm thinking of adding an additional keyword table that would have higher relevancy in returning results (currently, you'll be lucky to find any relevant entries).
I have a separate table and the design is as follows:
CREATE TABLE `keywords` (
`id` smallint(5) NOT NULL auto_increment,
`entryid` mediumint(9) NOT NULL,
`keyword` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
My other table, which contains the description and name fields, is `entries`.

My problem now is, how can I match the search keywords with that of the table `entries` and `keywords`? I want to avoid using more than one sql query for each search.

For each entry in the table `entries`, there may be up to 20 keywords (e.g. 20 rows in `keywords` table). I'm open to another approach for the keyword table. The reason why I have a separate table is to facilitate management and hopefully be able to pool results in a way of a cloud for keyword statistics.
Can anyone offer any advice/help in this regard?



Table Design Question (normal Forms)
I have 5 web forms that share 7 fields. Then each form has separate fields depending on the form type. Here's the breakdown of the number of non-shared fields:
form 1: 3
form 2: 13
form 3: 6
form 4: 1
form 5: 2

The table(s) will grow but will never hold gigs of data, just MBs of data.

It would be MUCH simpler to have one large table with all the fields and add a form_type field to differentiate.

In the long run, in your experience, is that the best way to do it or is it best to have one table for shared fields and separate tables for each form type?

Cocktail/Drink Mixes Table Design
I am thinking about developing a website that has a big database of different alcoholic drinks, but the database and code are going to be huge, so I wanted to see if someone already has something like this I can use as a base or knows something I could use as a base... An example of something I want to make is at http://www.drinksmixer.com/.

I have kind of sorted some MySQL tables I think.

catergories:
catergory_id
catergory_name

ingredients:
drink_id
ingredients_amount
ingredients_name

drink:
cat_id
drink_id
glass_id
drink_name
drink_directions
drink_comment

glass:
glass_id
glass_name
glass_description

I think that is right... I don't know if I have the ingredients right... Maybe I should just include that in the drinks table?

ingredients table:
1 1.5 Tia Maria
1 1.25 Vodka

And then when viewing a drink just grab the drink_id and match it against the drink_id in ingredients. Please help me at least get the MySQL tables up to scratch as this is the main part.

What else should I do with these tables so that they are linked correctly? I am new to both PHP and MySQL and I want the database tables finalized before I start any code.

Table Design For Diff. Levels Of Quiz
I have diff. levels starting from level 1 to say level 10 and each level has
10 questions and each question has four options.

The Table for one level is as shown below:

CREATE TABLE quiz(
quizid tinyint(4) NOT NULL AUTO_INCREMENT PRIMARY KEY,
quizlevel VARCHAR(50) DEFAULT NULL,
quizquestion VARCHAR(255) DEFAULT NULL,
quizoption1 VARCHAR(255) DEFAULT NULL,
quizoption2 VARCHAR(255) DEFAULT NULL,
quizoption3 VARCHAR(255) DEFAULT NULL,
quizoption4 VARCHAR(255) DEFAULT NULL,
quizanswer VARCHAR(255) DEFAULT NULL,
UNIQUE KEY quiz_level (quizlevel)
);
How do I create the same for various levels?

Table Design To Emulate Folders And Subfolders
I need a DB design to emulate an operating system so I can add folders (Categories) with unlimited folders (Sub categories) inside folders. Any suggestions?
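
One common approach to unlimited nesting is an adjacency list: each folder row stores the id of its parent, and a recursive query walks the tree. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table name category and the folder names are hypothetical. Note that MySQL only supports WITH RECURSIVE from version 8.0 onward; on older versions the tree would have to be walked in application code or stored differently (for example as nested sets).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical adjacency-list table: parent_id is NULL for top-level folders.
    CREATE TABLE category (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES category(id),
        name      TEXT NOT NULL
    );
    INSERT INTO category VALUES
        (1, NULL, 'Documents'),
        (2, 1,    'Work'),
        (3, 2,    'Reports'),
        (4, 1,    'Personal'),
        (5, NULL, 'Music');
""")

# Walk the tree downward from one folder, tracking how deep each row sits.
subtree = conn.execute("""
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, s.depth + 1
        FROM category c
        JOIN subtree s ON c.parent_id = s.id
    )
    SELECT id, name, depth FROM subtree ORDER BY id
""", (1,)).fetchall()

# Direct children only: a plain lookup on parent_id, no recursion needed.
children_of_root = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM category WHERE parent_id = ? ORDER BY id", (1,)
    )
]
```

Adding a subfolder at any depth is a single INSERT with the right parent_id, which is what makes this layout attractive for an open-ended hierarchy.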

Table Design, Normalization, Foreign Keys
In building a MySQL database what are the best practices to optimize speed and normalize?

This is a food management database. Tables are such as item, manufacturer, brand, vendor, store location, nutrition facts, price, store dept, packaging, item type. And a few others.

Given:
1. Tables are concise and have keys on all relevant search fields.
2. All tables have a prime, integer id field. (First field in table.)
3. Any repeating data field will be normalized. (Except for a few very minor areas.)

Questions:
1. Use where field=field clauses to link the various tables only or mostly?
2. Use joins?
3. Use foreign key constraints to ensure links and data completion?
4. Use triggers & procedures to do (3).
5. Have application programmers enforce the business logic? (Not really favored.)
6. Which approach is fastest? Where clauses to join, join clauses? Does it matter?

Any recommended design tools? Am using MySQL Browser & Administrator.

Table Design For Storing Item Stats
I've created quite the database for my app, and I'm familiar with normalization and general db design. However, I'm ready to design my tables for recording all sorts of stats for my db items.

I want to be able to view stats for each item, such as number of small views, number of large views, number of downloads, ratings and so on. I also want to be able to view these items by periods of time, such as by day, week, month, by multiple months, by year, etc. So, I'm trying to figure out what the most efficient way of recording this data is, so that I can keep my tables as small as possible. I'm wondering if I will have to insert a row for every stat, or if I should enter a row for every day, or only a day if there is activity, etc.

Large Table, Fulltext, Design Decisions, Optimization
Maybe somebody can give me some suggestions, ideas and help on my problem. I'm not a newbie with SQL but I'm not too great either; I usually don't care about optimization because I don't need to. I don't use forums too much and when I do I see replies like "you are a moron/idiot/stupid". If you want to reply with something like that for whatever reason, at least tell me the reason you think that so I get enlightened.

First of all let me explain the application I am developing so it is more clear. Briefly, it will be a price comparison and social shopping site. Similar applications (sites) are froogle.com and shopping.com. The site's main features will be:

* Product listing by category
* Price comparison for a product using prices from several stores that sell it
* Reviews for products
* Reviews for stores
* And many more that are not important right now

First of all I have a problem regarding an important decision. This site must be multilingual, so the question that arises is whether I keep everything in a single database, in a single site, or keep a database for each language. I am thinking of going with a database (or even a different domain or subdomain, like uk.site.com, fr.site.com) for every language, and a main database that keeps information that is not language specific (like product attributes). The reason I'm inclined to do it like this is that a user from France will be interested only in products which are being sold in France, and if I keep all descriptions for all languages for all products in a single table I will have a huge table for no other reason than to keep everything in one place.

So what is your opinion regarding this? Please back up your opinion with facts and reasons.

The next problem is the huge amount of data I will have to work with. Though the site will start very small (on a shared/virtual host), if it becomes popular (and I will do my best for that) I will not have a problem switching to a dedicated server or a cluster or whatever, but I want it to be as optimized as possible. So I was talking about the large amount of data as being a problem. That's because I did a benchmark on my PC and got really unpleasant results (details about my PC configuration, server versions, MySQL configuration and SQL of the database are at the bottom of my post so they don't get in the way):

Records in products table: 844701

Times: (the number in parentheses is the real one; I used MySQL Query Browser)
------
SELECT COUNT(*) FROM products;
0.0092s (0.0019s)
SELECT * FROM products WHERE id=700123;
0.0183s (0.0349s)
SELECT id, brand_id, title FROM products WHERE categ_id=3 LIMIT 0, 20;
0.0128s (0.0501s)

Up to here everything is just great. As you can see, the last query could be used to get the first 20 products in the category with ID=3 (all products have categ_id=3), which is useful for pagination. But when I change the same query's LIMIT from LIMIT 0, 20 to LIMIT 700000, 20 I get the result set in 0.0096s (20.5210s); that is 20 seconds, which is unacceptable (even if most users don't browse but search instead).
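
One common way around a slow deep LIMIT offset is keyset (or "seek") pagination: remember the last id shown on the current page and ask for the rows after it, so the server typically does not have to read and throw away all the skipped rows. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table is trimmed to three columns, and the index name and the next_page helper are made up. The trade-off is that you page forward from a known position rather than jumping straight to page 35000.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the SELECT itself is plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, categ_id INTEGER, title TEXT)"
)
# Composite index (hypothetical name) so the filter and the ordering share one index.
conn.execute("CREATE INDEX idx_categ_id ON products (categ_id, id)")
conn.executemany(
    "INSERT INTO products (id, categ_id, title) VALUES (?, ?, ?)",
    [(i, 3, f"product {i}") for i in range(1, 1001)],
)


def next_page(conn, categ_id, last_seen_id, page_size=20):
    """Return the rows that come right after last_seen_id in one category."""
    return conn.execute(
        "SELECT id, title FROM products "
        "WHERE categ_id = ? AND id > ? "
        "ORDER BY id LIMIT ?",
        (categ_id, last_seen_id, page_size),
    ).fetchall()


first_page = next_page(conn, 3, 0)
# The last id of one page is the starting point of the next one.
second_page = next_page(conn, 3, first_page[-1][0])
```

In a page link this would mean passing the last id seen (for example as a query-string parameter) instead of the page offset.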

So what happens if i search for something with LIKE?
First of all we need to find out the total number of results the search returns:


SELECT COUNT(*) FROM products WHERE title LIKE '%nvidia geforce%'
0.0304s (3.3100s) - that's a lot

Now let's get the first 20 results:


SELECT id, brand_id, categ_id, title FROM products WHERE title LIKE '%nvidia geforce%' LIMIT 0, 20;
0.0092s (0.0047s) - that's just great

What if we want to order by the title column?


SELECT id, brand_id, categ_id, title FROM products WHERE title LIKE '%nvidia geforce%' ORDER BY title LIMIT 0, 20;
0.0099s (26.7591s) awww... 26 seconds, but if I execute the same query again it's just 0.5 seconds (I believe it's caching or something)

What if I want to see the last results for this search?


SELECT id, brand_id, categ_id, title FROM products WHERE title LIKE '%nvidia geforce%' LIMIT 231000, 20;
0.0029s (5.3272s) (if I execute it again I get the result in 2.8 seconds)

Well, I could throw in some caching techniques on the PHP side, maybe I will have just a few visitors, etc... but I will definitely not use LIKE to search. I want to use FULLTEXT. So let's see some results with that:
First let's get the number of results the search would return:


SELECT count(*) FROM products
WHERE MATCH (title, brief_desc, full_desc) AGAINST ('norwood nvidia geforce2');
I'm getting old here waiting for a result... it took 62 seconds and returned 406158 results. If I execute the query again it takes only 5 seconds, but that's still a lot.

Now let's get the first results ordered by score:


SELECT id, brand_id, categ_id, title, MATCH (title, brief_desc, full_desc) AGAINST ('norwood nvidia geforce2')
AS score
FROM products WHERE MATCH (title, brief_desc, full_desc) AGAINST ('norwood nvidia geforce2')
ORDER BY score LIMIT 0, 20;
It was fast: 3.4 seconds.
Let's get the last 20 results:


SELECT id, brand_id, categ_id, title, MATCH (title, brief_desc, full_desc) AGAINST ('norwood nvidia geforce2') AS score
FROM products WHERE MATCH (title, brief_desc, full_desc) AGAINST ('norwood nvidia geforce2')
ORDER BY score LIMIT 406100, 20;
This one took 6.7 seconds.

If I search for just 'geforce2' it does it in 0.6 seconds (beats me why so fast).

Now I'm impressed. Yesterday I did a fulltext search (I don't remember the string) and it took 200 seconds. Meanwhile I defragmented my HDD; maybe that has something to do with it.

So for now my questions are:

* How can I optimize these queries and/or database structure?
* Is this the best I can get out of it (on my computer)?
* Does anyone know, or can anyone estimate, what to expect from a 2x Dual Core Xeon 3.2GHz (4 virtual CPUs), 2GB RAM, 2 SCSI HDDs, which runs just Apache, PHP, MySQL and mail on Linux?
* Also, how do these timings change in the case of a 10 million row table (instead of under 1 million like now)?
* Why is MySQL Query Browser showing me two times (the one in parentheses being the real one)?


I also have a database design issue. I need to keep attributes for products. For example, for digital cameras I need to have Optical Zoom, Sensor Resolution, etc. I can have a table like this:


CREATE TABLE prod_attrib (
id bigint(20) unsigned NOT NULL auto_increment,
prod_id bigint(20) unsigned NOT NULL,
attrib_id integer unsigned NOT NULL,
attrib_value varchar(255),
PRIMARY KEY (id) -- an auto_increment column must be part of a key
);
But with this approach, if I have 10 million products and an average of 10 attributes per product, I end up with a 100 million row table, which I think is a problem. If I keep a table for each category of products (digital_cameras, mp3_players, etc.), each with its own specific structure, I can't do an overall search. Does anyone know if the MySQL 5.1 XML support will help me with this? Please tell me how you would do this.
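
For what it's worth, a tall attribute table like this can still be filtered on several attributes in one pass: match any of the wanted (attribute, value) pairs, group by product, and keep only the products that matched all of them. The sketch below is a rough illustration, not a benchmark. It assumes Python's sqlite3 in place of MySQL, a composite primary key on (prod_id, attrib_id) instead of the surrogate id, and made-up attribute ids and values; find_products is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE prod_attrib (
        prod_id      INTEGER NOT NULL,
        attrib_id    INTEGER NOT NULL,
        attrib_value TEXT,
        PRIMARY KEY (prod_id, attrib_id)  -- one value per attribute per product
    )""")
# Index (hypothetical name) that lets lookups by attribute and value avoid a full scan.
conn.execute("CREATE INDEX idx_attrib_value ON prod_attrib (attrib_id, attrib_value)")

OPTICAL_ZOOM, SENSOR_RES = 1, 2  # made-up attribute ids
conn.executemany("INSERT INTO prod_attrib VALUES (?, ?, ?)", [
    (100, OPTICAL_ZOOM, "3x"), (100, SENSOR_RES, "10MP"),
    (101, OPTICAL_ZOOM, "3x"), (101, SENSOR_RES, "6MP"),
    (102, OPTICAL_ZOOM, "10x"), (102, SENSOR_RES, "10MP"),
])


def find_products(conn, wanted):
    """wanted maps attrib_id -> required value; returns ids matching all of them."""
    if not wanted:
        raise ValueError("need at least one attribute to filter on")
    pairs = list(wanted.items())
    conditions = " OR ".join("(attrib_id = ? AND attrib_value = ?)" for _ in pairs)
    params = [value for pair in pairs for value in pair]
    sql = (
        "SELECT prod_id FROM prod_attrib "
        f"WHERE {conditions} "
        "GROUP BY prod_id HAVING COUNT(*) = ? "  # matched every requested pair
        "ORDER BY prod_id"
    )
    return [row[0] for row in conn.execute(sql, params + [len(pairs)])]


matches = find_products(conn, {OPTICAL_ZOOM: "3x", SENSOR_RES: "10MP"})
```

The COUNT(*) check relies on the primary key guaranteeing at most one row per product and attribute; with the original surrogate id that guarantee would need a separate unique key.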

Thanks a lot to everybody who took the time to read all this. I'm waiting for your answers/suggestions.
---------------------------------------------------
PC Hardware:
------------
CPU: AMD Athlon XP 2200+ (1.81GHz)
RAM: 768MB DDR @ 400MHz
MB: Gigabyte KT600
HDD: (1) Maxtor DiamondMax 9, 80GB, ATA 133 and (2) Maxtor DiamondMax 10, 250GB, ATA 133. Both HDDs are on the same ATA 133 cable.

OS: Windows XP Pro (SP2)
Servers: Apache 2.0.59, PHP 5.1.4, MySQL 5.0


MySQL Installation Details (Server Instance Configuration):
----------------------------------------------------------
- Server type: Developer Machine
- Database usage: Multifunctional Database
- InnoDB Tablespace Settings: default/unmodified
- Concurrent connections: Decision Support (DSS)/OLAP (A number of 20 connections will be assumed)
- TCP/IP Networking enabled, Strict Mode enabled (as recommended)
- Standard character set (latin1)
- Installed as Windows service

I have created a database (g2b_benchmark) for benchmarking purposes with the following structure:
Note: I used a PHP script to populate the products table with 844701 rows. Some rows are repeated, but not consecutively.

CREATE TABLE `g2b_benchmark`.`brands` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(100) NOT NULL,
`site` varchar(100) default NULL,
`email` varchar(100) default NULL,
`description` text,
PRIMARY KEY (`id`),
UNIQUE KEY `Index_2` (`name`),
KEY `Index_3` (`description`(45))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE `g2b_benchmark`.`categories` (
`id` int(10) unsigned NOT NULL auto_increment,
`parent_id` int(10) unsigned NOT NULL default '0',
`lang` char(2) NOT NULL default 'en',
`name` varchar(100) NOT NULL,
`keywords` varchar(255) default NULL,
PRIMARY KEY (`id`,`lang`,`parent_id`),
UNIQUE KEY `Index_3` (`name`), -- category names must be distinct
KEY `Index_4` (`keywords`), -- not important right now
KEY `FK_categories_1` (`parent_id`) -- parent_id is the ID of the parent category
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `g2b_benchmark`.`products` (
`id` bigint(20) unsigned NOT NULL auto_increment,
`categ_id` int(10) unsigned NOT NULL,
`brand_id` int(10) unsigned NOT NULL,
`lang` char(2) NOT NULL default 'en', -- I thought I'd throw all products for all languages into this table
`title` varchar(150) NOT NULL,
`brief_desc` varchar(255) NOT NULL,
`full_desc` text NOT NULL,
PRIMARY KEY (`id`,`lang`),
KEY `Index_3` (`categ_id`), -- fast search/ordering by category
KEY `FK_products_2` (`brand_id`), -- fast search/ordering by brand
KEY `Index_5` (`title`), -- fast ordering by title
FULLTEXT KEY `Index_2` (`title`,`brief_desc`,`full_desc`) -- I need fulltext search on these
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

MySQL Table Design Style With Foreign Keys
I'm just wondering what proper (My)SQL style would be for the following situation (which I made up for the sake of this)...

You have a SpiceRack table and a Spice table. Each rack has many spices, in typical one-to-many fashion. So the Spice table would have a rackID column. Easy.

But let's say each rack has one default spice.

So should the SpiceRack table have a column "DefaultSpiceID" or should the Spice table have a column "isDefault"?

In one sense, I would say the former, to help enforce a single default spice. In another sense, I would say having two tables point to each other is ugly and just do some extra queries to enforce that there is only one with the latter.

And while I'm at it... for that isDefault, and similar T/F columns... What's the best data type? SMALLINT (1) UNSIGNED, ENUM(0,1), or LONGBLOB?

Migrating Data From A Brain-dead Table Design To A Good One
I'm redesigning the database schema of one of our applications, and a lot of brain-dead decisions were made in the original design. In this specific example, a relational table was used when it was completely unnecessary.

To be even more specific, we have a table for event times (call the table "times") and a table for people (call the table "people"). Each entry in the "people" table can only be associated with one event time (yes, if one person wants to sign up for more than one event time, there will be two separate entries, and we have good reason for that). However, the original creator made a rel_times_people table.

In my redesign, each entry in the "people" table just has a "time_id" field to associate it with the correct event time. I'd like to migrate the time_ids from the rel_times_people table to the "time_id" field in the people table (as I said, there is only one entry in the rel_times_people table for each person). Is there a way to do this using a query (or perhaps using some feature in phpMyAdmin that I don't know about)? Or will I have to do this by hand (I'd REALLY prefer not to; there are over 1000 entries)?
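
A migration like this can usually be done with a single UPDATE that copies the value across. The sketch below is only an illustration: it runs against Python's sqlite3 rather than MySQL, and the column names (people.id, rel_times_people.person_id) are guesses, since the post doesn't give them. MySQL also accepts the multi-table form UPDATE people p JOIN rel_times_people r ON r.person_id = p.id SET p.time_id = r.time_id, which does the same job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, time_id INTEGER);
    CREATE TABLE rel_times_people (person_id INTEGER, time_id INTEGER);
    INSERT INTO people (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO rel_times_people VALUES (1, 10), (2, 11);
""")

# Copy each person's time_id out of the relation table.
# The EXISTS guard leaves people without a relation row untouched.
conn.execute("""
    UPDATE people
    SET time_id = (SELECT r.time_id
                   FROM rel_times_people r
                   WHERE r.person_id = people.id)
    WHERE EXISTS (SELECT 1
                  FROM rel_times_people r
                  WHERE r.person_id = people.id)
""")

migrated = conn.execute("SELECT id, time_id FROM people ORDER BY id").fetchall()
```

It is probably worth running it on a copy of the data first and checking the row counts before dropping rel_times_people.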

A Lil Advice
I'm looking to build a relational database that will hold product reviews and statistics that I will eventually publish on the web, where they will be searchable, etc. I'm working on Mac OS 10.3.9.

Is MySQL the way to go? What app should I be using to create the database?

Advice
I own a company whereby my 50 employees (soon to be 100) all work from home and access our server via terminal services client. These 50 people work on 40 separate Access databases. Some small, some large (over 100 mb).
My IT consultant tells me that I should move to a SQL server (which I agree with), upgrade to Access 2003, and minimize terminal server use by moving to a web-based input & reporting engine for common employee tasks (due to the load of 100 employees on Access).
- Access 2003 is $200 a user (are there cheaper options?)
- We regularly modify the Access databases. Note that I created the databases but I am not a developer. I am very comfortable in Access (it's the only database software I have ever used).
So, is MySQL an option for this environment?
I know it's a big, ugly question with few details, but I am looking at all options.

Advice Requested
I'm going to be creating a signup form with the standard fields (fname, lname, email, etc), plus user definable fields. Currently, I'm kind of stuck on how to go about doing this.

My users table has an id field (PK), email field, IP address field, etc. I want to give the site admins more control over what they can put into their signup forms. Some people may want to have a website field, others may want to have a field for the user's fav book. Whatever.

Now, if I put these into the users table, I'm now fixed at how many fields I can have, correct? I mean, I can set a determined number of fields and call them custom1, custom2, etc. and let the admin define their user-readable names, but it's still a fixed amount of fields.

I could create a new table with 2 fields - name and value - where name is the PK. That way the admin could enter as many as they want. But how would I then have the same fields in the users table to hold the data?

Best Architecture Advice
Let's say you have 800,000 employees (huge company huh?), 300 sectors, and 3 security levels for each sector. You want to hold the security level for each sector for each employee. I can't figure out the best way to do this. Right now, my company has a table for each sector - each with columns [employee_id, security_level]. This results in 300 different tables and the joining when searching on several sectors gets crazy. I'm trying to figure out an easier way to structure this.

If you have one table [employee_id, sector_id, level_id], because of the many-to-many relationship between employee_id and sector_id, that's going to give you a 240,000,000 row table. If you have one table made up of a column for each sector holding the security level value [employee_id, sector1, sector2, sector3, etc.], you have a table with 301 columns, which will put me way over my index limit.

Optimization Advice
I'm having a database connection problem on my VPS server running the SMF forum script.
It often displays an "unable to connect to database" notification when there are more than 200 users online (in a 15 minute period).
I have raised the max MySQL connection number from 100 to 150, but that didn't help.
The connections per second number has doubled in the last month (but traffic is only slightly higher).

Here is some MySQL information; any advice on how to get these numbers to normal values is welcome.

Advice Sought
I'm somewhat new to database design in MySQL from a "ground up" point of view. I've used databases in a limited capacity, but this is my first time developing one from scratch. It's mainly for educational purposes, so I can eventually do this myself rather than hit hotscripts or some other repository, but it will be for a functional website at the same time.
I have a library of links I would like to display. My client has given me a Word doc with the links and a description.
I assume I would just create a table with Category,Title,URL,Description as fields, but if I want to present the end user with a dropdown list at the top of the page that allows them to select a category, would I have to create a separate table with just the categories, and link it to the 1st DB?
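
A separate categories table is the usual way to do this: the links table stores a category id, the dropdown is filled from the categories table, and the selected id filters the links. The sketch below is one possible layout, using Python's sqlite3 in place of MySQL; the table names, column names, helper and sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE links (
        id          INTEGER PRIMARY KEY,
        category_id INTEGER NOT NULL REFERENCES categories (id),
        title       TEXT NOT NULL,
        url         TEXT NOT NULL,
        description TEXT
    );
    INSERT INTO categories (id, name) VALUES (1, 'Tutorials'), (2, 'Tools');
    INSERT INTO links (category_id, title, url, description) VALUES
        (1, 'Intro',  'http://example.com/intro',  'Getting started'),
        (2, 'Editor', 'http://example.com/editor', 'A text editor'),
        (1, 'Joins',  'http://example.com/joins',  'How joins work');
""")

# Rows for the dropdown: one (id, name) pair per category.
dropdown = conn.execute("SELECT id, name FROM categories ORDER BY name").fetchall()


def links_in_category(conn, category_id):
    """Links to show once the visitor has picked a category."""
    return conn.execute(
        "SELECT title, url, description FROM links "
        "WHERE category_id = ? ORDER BY title",
        (category_id,),
    ).fetchall()


tutorial_links = links_in_category(conn, 1)
```

Keeping the category name in one place also means renaming a category is a single-row update.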

Advice On How To Structure Tables
I currently have a dynamic site that displays info about different musicians. My database is currently set up with a table that contains all the info about each musician, such as name, type of music, website address etc., i.e. all single bits of information.

Now I would like to add a form to allow the musicians to upload one or more mp3 files.

It would be fine if it was just one mp3 file as I would simply have a field in the table to store the name of the mp3 file.

However, I am not sure how to deal with multiple mp3 files. Do I just create, say, 10 fields in the table for a maximum of 10 mp3 files for each musician? This doesn't seem the most sensible idea, but I am fairly new to PHP and MySQL so I am not sure what alternatives there are.

I would appreciate some advice on this. I was thinking perhaps I could create a separate table to store all the names of all the mp3 files for all musicians but doing this I do not know how I would know which ones are related to each musician.

Also related, should I store the mp3 files in separate folders for each musician or just have one mp3 folder on the server and save them all in here?
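
The separate-table idea works if each row in the mp3 table carries the id of the musician it belongs to; that column is what ties the files back to their owner. A minimal sketch of that layout follows, with Python's sqlite3 standing in for MySQL and with made-up table names, column names and sample files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musicians (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE mp3_files (
        id          INTEGER PRIMARY KEY,
        musician_id INTEGER NOT NULL REFERENCES musicians (id),  -- the owner
        filename    TEXT NOT NULL
    );
    CREATE INDEX idx_mp3_musician ON mp3_files (musician_id);
    INSERT INTO musicians (id, name) VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO mp3_files (musician_id, filename) VALUES
        (1, 'ana_song1.mp3'), (1, 'ana_song2.mp3'), (2, 'ben_demo.mp3');
""")


def files_for(conn, musician_id):
    """All mp3 filenames uploaded by one musician, in upload order."""
    rows = conn.execute(
        "SELECT filename FROM mp3_files WHERE musician_id = ? ORDER BY id",
        (musician_id,),
    )
    return [row[0] for row in rows]


# How many files each musician has, including musicians with none.
counts = conn.execute("""
    SELECT m.name, COUNT(f.id)
    FROM musicians m
    LEFT JOIN mp3_files f ON f.musician_id = m.id
    GROUP BY m.id, m.name
    ORDER BY m.name
""").fetchall()
```

With this layout there is no fixed limit of 10 files; an upload is just one more row.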

Advice On Hierarchical Schema
I'm trying to come up with a schema for a hierarchy of shopping categories.

Each category can have many child categories and many parents, i.e. it can exist more than once, and at different levels of the hierarchy. There will be a limit of 5 to the depth of the tree, but some branches will terminate sooner. At the end of the branch will be a link to products within that category.

I've read up on hierarchical structures, and even read Joe Celko's book on the subject (I've just recovered from the resulting brain embolism), but haven't really found solutions that seem to handle the many-to-many relationships going on, and in the case of the Celko book there was a lot that MySQL 4 can't do.

I'm using version 4.1.20 (stuck on Red Hat and it doesn't support 5)



Advice On Plan Of Attack
OK, I have a large web-based questionnaire form I am constructing using PHP/MySQL. There are around 50 questions, each with anywhere between 4 and 30 possible answers that the user can select using checkboxes, meaning there can be multiple checkbox answers per question.

My question is: what would be the best method to store the checkbox data in MySQL, keeping in mind there is also going to be a search routine that will need to pull the data back out efficiently?

Should I pipe-delimit (|) the checkbox answers in one MySQL field per question, so there are 50 fields with pipe-delimited responses, or do I make a field in the table for each possible answer (around 750 fields)? Or is there a better method?

With so many checkbox answers, what is the best/ most efficient way to store their responses knowing the data will need to be searchable?

I can get through the programming side of it, although I am not an expert by any stretch of the imagination. Just looking for advice on how to set it up before I get started.

I don't want to do it one way and then discover later that the search is too slow because I didn't lay it out right first.
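
A third option besides pipe-delimited fields and 750 columns is one row per ticked checkbox. Searching then becomes an indexed lookup instead of string matching inside a delimited field. The sketch below is only one way to lay it out, using Python's sqlite3 as a stand-in for MySQL; the table, column and helper names are invented, and the questions and answers themselves would live in their own lookup tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per ticked checkbox.
    CREATE TABLE responses (
        respondent_id INTEGER NOT NULL,
        question_id   INTEGER NOT NULL,
        answer_id     INTEGER NOT NULL,
        PRIMARY KEY (respondent_id, question_id, answer_id)
    );
    -- Supports searches of the form "who ticked answer A on question Q".
    CREATE INDEX idx_question_answer
        ON responses (question_id, answer_id, respondent_id);
""")
conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", [
    (1, 1, 2), (1, 1, 5), (1, 2, 1),
    (2, 1, 5), (2, 2, 3),
    (3, 1, 2), (3, 2, 1),
])


def who_ticked(conn, question_id, answer_id):
    """Respondents who ticked a given answer on a given question."""
    rows = conn.execute(
        "SELECT respondent_id FROM responses "
        "WHERE question_id = ? AND answer_id = ? ORDER BY respondent_id",
        (question_id, answer_id),
    )
    return [row[0] for row in rows]


def answers_of(conn, respondent_id, question_id):
    """All answers one respondent ticked on one question."""
    rows = conn.execute(
        "SELECT answer_id FROM responses "
        "WHERE respondent_id = ? AND question_id = ? ORDER BY answer_id",
        (respondent_id, question_id),
    )
    return [row[0] for row in rows]
```

For example, who_ticked(conn, 1, 5) lists everyone who ticked answer 5 on question 1, without scanning any text fields.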

Beginners Advice / First Database
I am creating my first database / website and would appreciate advice on how to organize or set up the tables.

What I have is user-uploaded documents.

CREATE TABLE `documents` (
`doc_id` int(10) unsigned NOT NULL auto_increment,
`member_name` varchar(255) NOT NULL,
`language` varchar(255) NOT NULL,
`country` varchar(255) NOT NULL,
`pgrating` varchar(50) NOT NULL,
`document_name_file` varchar(255) NOT NULL,
`type` mediumint(2) default NULL,
`document_pathto` varchar(255) NOT NULL,
`category` varchar(255) NOT NULL,
`keywords` varchar(255) default NULL,
`series` varchar(255) NOT NULL,
`title` varchar(255) NOT NULL,
`date` date default NULL,
`votes` mediumint(255) NOT NULL default '0',
`price` mediumint(255) NOT NULL,
`rating` mediumint(255) NOT NULL default '0',
`purchased` mediumint(255) NOT NULL default '0',
`leamount` varchar(255) default NULL,
`leExpired` mediumint(2) NOT NULL default '0',
`userDisabled` tinyint(1) NOT NULL default '0',
`lowrateDisabled` mediumint(2) NOT NULL default '0',
PRIMARY KEY (`doc_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;

My question is, for searching for these should I make a lot of lookup tables such as?

category id | comic id

Some of the documents can be (changed to) private and not displayed in search results,

so I created a table "a_docs"
user_name(id) | active doc (id)

Also, the visitor can sort by keywords, ratings, category, country, language, latest and type.

My question is, should I create a "lookup" table for each of the search criteria and then do something like a join (I honestly don't have a clue what I am talking about yet) ?

I also need to generate reports for the members such as how many and what documents were purchased.

My guess is a lookup table
purchased_docs

user_name(id) | purchased doc (id)

Then, once I have the doc (id) for the user, I can look up the "documents" table to find the price for that document (id). Is this a good approach?
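
The two-step lookup described here can be done in one query by joining the purchases table to the documents table. The sketch below is a rough illustration using Python's sqlite3 instead of MySQL; the documents table is cut down to three columns, user_id stands in for the member's id, the prices are arbitrary integers, and the helper names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        doc_id INTEGER PRIMARY KEY,
        title  TEXT NOT NULL,
        price  INTEGER NOT NULL
    );
    CREATE TABLE purchased_docs (
        user_id INTEGER NOT NULL,
        doc_id  INTEGER NOT NULL REFERENCES documents (doc_id),
        PRIMARY KEY (user_id, doc_id)
    );
    INSERT INTO documents VALUES (1, 'Issue 1', 300), (2, 'Issue 2', 250), (3, 'Issue 3', 400);
    INSERT INTO purchased_docs VALUES (7, 1), (7, 3), (8, 2);
""")


def purchase_report(conn, user_id):
    """Which documents a member bought, with the current price of each."""
    return conn.execute(
        "SELECT d.doc_id, d.title, d.price "
        "FROM purchased_docs p "
        "JOIN documents d ON d.doc_id = p.doc_id "
        "WHERE p.user_id = ? ORDER BY d.doc_id",
        (user_id,),
    ).fetchall()


def total_spent(conn, user_id):
    """Sum of prices over a member's purchases; 0 if there are none."""
    row = conn.execute(
        "SELECT COALESCE(SUM(d.price), 0) "
        "FROM purchased_docs p "
        "JOIN documents d ON d.doc_id = p.doc_id "
        "WHERE p.user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0]
```

One thing to watch: this reads the price as it is now. If prices can change after a sale, the purchase row would probably need its own copy of the price paid.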

I expect to eventually have hundreds of thousands of documents and after this is built I will have someone go over the structure to make sure it is fully optimized.

BTW, I have been reading the documentation, and in my table the integer values and settings are all wrong.

I am good with PHP and have built dynamic websites with user-editable content in flat files, but now it's time to jump on the MySQL wagon.

Query Optimization Advice?
I am familiar with the VERY basics of MySQL (in other words, I am not an expert...), and I am currently working to create a simple search engine on our website. Users should be able to search a database of over 150,000 sheet music titles by simply entering a keyword phrase. That phrase should be searched for in different tables at the same time, and if the phrase includes more than one word, each of the words needs to be searched for.

For example, let's say a user is looking for "Brahms violin concerto". The search engine should display all the records that match all of the entered keywords.

Here are the tables that need to be searched:

|title|composer|instruments|description|


and here is the query I am trying to work on (after splitting the keyword phrase into separate keywords):

Quote:

SELECT title, composer, instruments, description, price FROM mydatabase WHERE (title REGEXP "(^| )brahms( |$)" OR title REGEXP "(^| )violin( |$)" OR title REGEXP "(^| )concerto( |$)") AND (composer REGEXP "(^| )brahms( |$)" OR composer REGEXP "(^| )violin( |$)" OR composer REGEXP "(^| )concerto( |$)") AND ((instruments REGEXP "(^|;| )brahms(;| |$)" OR instruments REGEXP "(^|;| )violin(;| |$)" OR instruments REGEXP "(^|;| )concerto(;| |$)") OR (description REGEXP "(^| )brahms( |$)" OR description REGEXP "(^| )violin( |$)" OR description REGEXP "(^| )concerto( |$)")) order by title like "%brahms%", title like "%violin%", title like "%concerto%", instruments like "%brahms%", instruments like "%violin%", instruments like "%concerto%"

The query works, but it takes too long to execute... over 10 seconds! I am aware that REGEXP (like "LIKE" with a leading %) doesn't use indexes, but I cannot find a different solution to match any possible keyword or part of it.

Any suggestion to optimize it? Or should I work on a completely different approach?

Performance Advice Required
I'm after some performance advice if anyone has some. I have a table and query, described below. The table holds about 13 million records. Query times are about 135 seconds, which is way too long for live queries. Does anyone have any suggestions on how to reduce this? I realise live queries to a database this size might have to be tackled by splitting into other tables etc., but other sites manage this, and I just don't know how. I have already summarised the data as much as possible to minimise the rows.

dns_stats CREATE TABLE `dns_stats` (
`hits` mediumint(11) unsigned default '0',
`domainname` varchar(100) NOT NULL default '',
`datetime` date NOT NULL default '0000-00-00',
`server` tinyint(4) unsigned NOT NULL default '1',
KEY `datetime` (`datetime`),
KEY `datetimedn` (`domainname`(20),`datetime`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

select domainname, sum(hits) as thits,server, datetime from dns_stats where domainname like '.moc.tsettsrif%' and datetime >= DATE_SUB(NOW(),INTERVAL 30 DAY) group by datetime,server order by datetime

and explain shows

1,SIMPLE,dns_stats,range,datetime,datetimedn,datetimedn,23,N,7925,Using where; Using temporary; Using filesort

This table gets updated each night, so it is only read from during the day. The queries are almost exclusively reads. I tried tests without the summary and grouping, but this only takes about 10 seconds off, so I'm guessing almost all the time is spent retrieving the initial records.

Complicated Join Advice
My tables look something like this:

FIGHTER
fighter_id
first_name
last_name
age
style

FIGHTS
fight_id
fighter1_id (key to FIGHTERS table)
fighter2_id (key to FIGHTERS table)
event_id (key to EVENT table)
winner (this field will contain either 1 or 2 indicating which fighter won)
win_round
win_time

codes...

I've been trying to figure out an appropriate query but can't seem to get it. The only solution I've come up with so far is to break it into multiple SELECT statements which seems to defeat the purpose of a relational database.


Copyright © 2005-08 www.BigResource.com, All rights reserved