Author Topic: Composite keys or surrogate keys in a relational table  (Read 1636 times)

Offline 133794m3r

  • Level 22
  • *
  • Posts: 265
  • Reputation: +2/-0
    • View Profile
Composite keys or surrogate keys in a relational table
« on: December 19, 2010, 12:16:20 AM »
Right now, i've got it using surrogate keys. The system itself is for everything from logging battle data, to transactions. I'm currently using a surrogate key for each of the tables that need to have one. The places that i'd probably be able to do it is in most of the tables. For example, the chat table is as such.

Code: [Select]
chat_normal(
id int unsigned not null,
player_id mediumint unsigned not null,
message varchar(150) not null,
sent timestamp on update current_timestamp not null,
primary key(id),
index(player_id)
)engine=innodb;
then fo course player_id is linked to the players table. Now then, would it be better to do a primary key on both player_id, and the time sent? Or keep it the current way. Since it's physically impossible to send a message at the same time from a person it'd be impossible for that relation to not be unique. Also the system is setup so that it throws out any messages sent less than one second before.

The only reason for the indexing itself is because i want to be able to solve tickets involving abuse by other players and such other spam within chat, and thus it'd be primarly used for searching since a person only gets messages that happen whilst they're online and only the last 20 maximum so thus it could've jut been a simple sent message without any actual logging.

My question to anyone out there who's already dealt with such things, would it be better to do a surrogate key as i currently have with the id? Or would a composite primary key be best? And by best, i am talking about performance.

Offline JGadrow

  • Level 35
  • **
  • Posts: 1,133
  • Reputation: +23/-2
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #1 on: December 19, 2010, 09:29:33 AM »
I'd keep the table structure the way that you have it. If you find yourself doing a lot of time-based index look-ups on this table, you can add an index to the timestamp, but single integer id columns are usually best.
Idiocy - Never underestimate the power of stupid people in large groups.


Offline 133794m3r

  • Level 22
  • *
  • Posts: 265
  • Reputation: +2/-0
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #2 on: December 19, 2010, 12:34:37 PM »
ah ok, because i was looking at the mangos people's sql(which is a sloppy mess) and i was wondering why they had a multi-column index for their chat system and it led me onto 'composite keys' and then i read about how it was 'supposedly better' than the surrogate integer key that i was using for that table. And thus i came onto there to see what others who had actually had their databases used in something and not just well had them sitting there.

and that's good to know then, and that also answers one of the other things i was thinking about doing which was an index on time but then i didn't know if i'd need it. And since a lot of people have said to do a seperate table with the number of things in said table instead of doing a count or a get last row's id, should i do something similar with the chat system? Have a table that stores the 'last update' time for it so that i can know if i need to really get all messages since <last time that it was checked> Or would it just be faster to see the last row and it's timestamp.

Offline Chris

  • Game Owner
  • Level 35
  • *
  • Posts: 2,217
  • Reputation: +28/-1
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #3 on: December 21, 2010, 05:54:08 AM »
Auto increment on id and index on timestamp. I don't see any reason for anything else for chat.

Offline aerosuidae

  • Level 9
  • *
  • Posts: 50
  • Reputation: +5/-0
    • View Profile
    • Return to Sol
Re: Composite keys or surrogate keys in a relational table
« Reply #4 on: January 02, 2011, 10:26:17 PM »
A belated +1 :) to the surrogate primary key, as JGadrow and Chris said.  It is just simpler.

But, since you said you're after performance, perhaps a little more info might be of interest:

1. You're using InnoDB which uses clustered indexes. This means the row data is stored along side the primary key behind the scenes, and every secondary index contains a copy of the primary key to link back to the record (in essence, all InnoDB secondary indexes are already a composite index covering the indexed fields and the primary key).

A composite primary key is generally larger than any surrogate, thus it makes all secondary indexes larger on disk and in memory which may reduce performance eventually, or cause you to hit the limits of your resources sooner.  +1 for surrogate primary keys :)

2. InnoDB in MySQL 5.0 has issues with AUTO_INCREMENT in that it uses a global mutex (lock) to protect the generation of unique auto_inc numbers.  If you have an active Chat system with lots of inserts, using auto_inc might be harmful and end up a bottle neck.  But, it would require a very active system... still, perhaps that is one reason why mango uses composite keys.

The problem is much reduced in the InnoDB plugin in MySQL 5.1 and 5.5 in any case.

3. You used mediumint for player_id.  InnoDB silently converts mediumint to int, so limiting your numeric range there is of no real benefit.

4. You might want to consider using MyISAM for a simple chat table.  Chat tables often use only INSERT / SELECT and never (or rarely) use UPDATE, so InnoDB's MVCC overhead doesn't help a whole lot.  Chat tables are a lot like logging/activity tables.  MyISAM + the concurrent_inserts setting will probably be faster and more compact.  

I expect most of this won't be that important until your game is massive :) $0.02
« Last Edit: January 02, 2011, 10:37:40 PM by aerosuidae »

Offline Chris

  • Game Owner
  • Level 35
  • *
  • Posts: 2,217
  • Reputation: +28/-1
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #5 on: January 02, 2011, 11:11:07 PM »
3. You used mediumint for player_id.  InnoDB silently converts mediumint to int, so limiting your numeric range there is of no real benefit.
I always overlooked mediumint existence :D How is it handled in modern CPUs? Back then 24 bit values were useless (except for saving space) since everything was converted to 32 bit for performance (no real support for other values than 8,16,32 and maybe 64 bit). Is this correct nowadays too?




Offline aerosuidae

  • Level 9
  • *
  • Posts: 50
  • Reputation: +5/-0
    • View Profile
    • Return to Sol
Re: Composite keys or surrogate keys in a relational table
« Reply #6 on: January 03, 2011, 06:46:47 AM »
I always overlooked mediumint existence :D How is it handled in modern CPUs? Back then 24 bit values were useless (except for saving space) since everything was converted to 32 bit for performance (no real support for other values than 8,16,32 and maybe 64 bit). Is this correct nowadays too?

Depends on the hardware architecture. For venerable x86, the rules are still much the same as you probably remember. A 24 bit value will likely need a 32 bit register and a 32 bit aligned memory address anyway, so why not use the extra byte.

Offline codestryke

  • Administrator
  • Level 33
  • *****
  • Posts: 589
  • Reputation: +22/-0
    • View Profile
    • eXtremeCast Games
Re: Composite keys or surrogate keys in a relational table
« Reply #7 on: January 03, 2011, 09:38:37 AM »
4. You might want to consider using MyISAM for a simple chat table.  Chat tables often use only INSERT / SELECT and never (or rarely) use UPDATE, so InnoDB's MVCC overhead doesn't help a whole lot.  Chat tables are a lot like logging/activity tables.  MyISAM + the concurrent_inserts setting will probably be faster and more compact.  

I expect most of this won't be that important until your game is massive :) $0.02

The game doesn't have to be massive for a chat system to get out of control fast, it just has to be active. Hardware and config settings will go along way in how responsive your chat is. With our experiences everything above is the complete opposite. Our chat has gone from 2 people to over 100+ active chatters so we've had the ability to test and see various results.

myISAM's table locking even at moderate chat use causes the write locks to queue up and cause lag. myISAM may be faster but only until the locking queue starts slowing things down. Granted we switched over to INNODB when mySQL 4.x was just getting released so myISAM maybe a bit better handling its locks. However we are going on over 5 years with using INNODB without have to tweak anything which how we like things (set it and forget it).

Chat may be stored like a log/activity file but it sure isn't accessed like one and there in lies the problem when it comes to chat. Log/activity files are heavy writes and very little reads, chat is just as heavy on the reads as it is on writes so you have to think of it very differently then a log/activity file.

Creating online addictions, one game at a time:

Offline CygnusX

  • Level 24
  • *
  • Posts: 303
  • Reputation: +3/-2
    • View Profile
    • Lords of Midnight
Re: Composite keys or surrogate keys in a relational table
« Reply #8 on: January 03, 2011, 09:45:00 AM »
This may be a silly question, but why are you writing a chat application?  There are several IRC solutions for this that may be able to reduce your workload.

Offline Chris

  • Game Owner
  • Level 35
  • *
  • Posts: 2,217
  • Reputation: +28/-1
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #9 on: January 03, 2011, 10:55:34 AM »
Chat may be stored like a log/activity file but it sure isn't accessed like one and there in lies the problem when it comes to chat. Log/activity files are heavy writes and very little reads, chat is just as heavy on the reads as it is on writes so you have to think of it very differently then a log/activity file.
Why not make a buffer table (one row, one text field only with all recent chat stored as one big HTML)? You make writes normally to separate rows on chat table and then it updates the buffer table, all users read only from buffer table, the read from chat table is done only during buffer update (when someone says something).

Offline codestryke

  • Administrator
  • Level 33
  • *****
  • Posts: 589
  • Reputation: +22/-0
    • View Profile
    • eXtremeCast Games
Re: Composite keys or surrogate keys in a relational table
« Reply #10 on: January 03, 2011, 01:10:27 PM »
This may be a silly question, but why are you writing a chat application?  There are several IRC solutions for this that may be able to reduce your workload.

We use to use an IRC server but there are many problems that come with using it. Our server company will not allow us to run an IRC daemon so we had to use a service. When you use a third party service you lose a lot of control, it's there OPs, it's there help system (if they have one), bot systems to keep control of a channel etc etc.

The other problem is immersion. I don't want ppl to leave the game, I want ppl in the game and I want the chat to broadcast things going on in the game. This keeps the "chat" focused to the game as it should be. I'm not saying chatters can't or shouldn't discuss other things in there life but the fact is the chat should be about the game, not about "chat" which is what often happens with IRC. Find just about any IRC server log in, find rooms that are based on topics and enter them. I'll bet less then 30% of the ppl in them are actually talking about the topic the channel is based on.

Finally we are running a business and you can put all the disclaimers you want up but if you offer a way to chat you are held responsible for what happens in that chat. With IRC unless you have a bot or something there is no way of viewing logs, if it's a PM between two ppl then you'll never get logs. Without logging you have no way of knowing what is going on or being said in chat. So what happens when a player threatens another player, what if someone comes in a claims it's you and they need that persons pw to check there account, what if you chat is flooded with adverts for some other game.

We've had all this and much more when we operated on IRC and I honestly can't see why ppl actually even consider IRC when it comes to BBGs.

Creating online addictions, one game at a time:

Offline aerosuidae

  • Level 9
  • *
  • Posts: 50
  • Reputation: +5/-0
    • View Profile
    • Return to Sol
Re: Composite keys or surrogate keys in a relational table
« Reply #11 on: January 03, 2011, 05:31:07 PM »
myISAM's table locking even at moderate chat use causes the write locks to queue up and cause lag. myISAM may be faster but only until the locking queue starts slowing things down. Granted we switched over to INNODB when mySQL 4.x was just getting released so myISAM maybe a bit better handling its locks. However we are going on over 5 years with using INNODB without have to tweak anything which how we like things (set it and forget it).

Fair enough, the options were more limited back then :) FWIW modern MyISAM allows concurrent inserts:

http://dev.mysql.com/doc/refman/5.0/en/concurrent-inserts.html
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_concurrent_insert

This lets INSERT to write to the end of the table without locking (and blocking) SELECT.
« Last Edit: January 03, 2011, 05:58:18 PM by aerosuidae »

Offline aerosuidae

  • Level 9
  • *
  • Posts: 50
  • Reputation: +5/-0
    • View Profile
    • Return to Sol
Re: Composite keys or surrogate keys in a relational table
« Reply #12 on: January 03, 2011, 05:54:11 PM »
Why not make a buffer table (one row, one text field only with all recent chat stored as one big HTML)? You make writes normally to separate rows on chat table and then it updates the buffer table, all users read only from buffer table, the read from chat table is done only during buffer update (when someone says something).

This tactic is very effective for all sorts of things, not just chat. But, instead of the second table, use memcache for the buffer. Same idea.

http://php.net/manual/en/book.memcache.php

Offline Chris

  • Game Owner
  • Level 35
  • *
  • Posts: 2,217
  • Reputation: +28/-1
    • View Profile
Re: Composite keys or surrogate keys in a relational table
« Reply #13 on: January 04, 2011, 05:43:41 AM »
"As of MySQL 3.23.7 (3.23.25 for Windows), the MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no holes in the data file (deleted rows in the middle), an INSERT statement can be executed to add rows to the end of the table at the same time that SELECT statements are reading rows from the table. "

Seems this was supported in ancient MySQLs too.

What exactly "no holes in the data file" mean? If after delete we do OPTIMIZE TABLE do we still have holes or not?


And how exactly are rows stored in database? If I delete a row from the middle and add a new one (MyISAM) the new one will go in the place of old one, regardless of AUTO INCREMENT primaty key. After OPTIMIZE nothing changes (which is strange since it is supposed to recreate the table from scratch). Seems like it will never arrange the table in IDs order...

Offline aerosuidae

  • Level 9
  • *
  • Posts: 50
  • Reputation: +5/-0
    • View Profile
    • Return to Sol
Re: Composite keys or surrogate keys in a relational table
« Reply #14 on: January 04, 2011, 07:32:33 AM »
Seems this was supported in ancient MySQLs too.

Concurrent inserts have been around for some time, yep, but MySQL 5.x improved on previous versions with finer controls and some bug fixes for better scalability, particularly on SMP systems. Some distros and shared hosts also kept them turned off in older versions, due to bugs.

Obviously concurrent inserts are not one-size-fits-all :)  Depends on the schema and traffic. Other things too, like large indexes or many indexes on the table will start to introduce INSERT delay and locking problems anyway for MyISAM, concurrent inserts or not. Inserting to an ever expanding table will get slower no matter what. Some people switch to InnoDB; some archive old data to keep the table and indexes small; some implement caching at a higher level. All valid solutions.

What exactly "no holes in the data file" mean? If after delete we do OPTIMIZE TABLE do we still have holes or not?

It means no data file fragmentation. MyISAM with variable length records can leave gaps in the data file when old records are deleted and new records don't perfectly fit said gaps. When inserting, normally MyISAM will try to fill any gaps before extending the table. concurrent_insert=2 forces INSERT to ignore data file gaps unless no SELECTs are running.

And how exactly are rows stored in database? If I delete a row from the middle and add a new one (MyISAM) the new one will go in the place of old one, regardless of AUTO INCREMENT primaty key. After OPTIMIZE nothing changes (which is strange since it is supposed to recreate the table from scratch). Seems like it will never arrange the table in IDs order...

For MyISAM tables with variable length records, the new row you insert may not fit in the old gap.  For tables with fixed length records, yes, the gaps will be filled first. (Actually it is a bit more complex than this, because MyISAM can also split records into parts sometimes too.)

OPTIMIZE for MyISAM does not necessarily rebuild the entire table. It only defragments split records in .MYD file (which could still leave some gaps), sorts the index pages in .MYI file and recalculates index cardinality values. OPTIMIZE for InnoDB does rebuild the entire table, because it simply mapped to ALTER TABLE underneath.
« Last Edit: January 04, 2011, 07:35:37 AM by aerosuidae »

 


SimplePortal 2.3.3 © 2008-2010, SimplePortal