Search This Blog

Monday, December 27, 2010

Connecting to MySQL From PHP

Okay, let's get back to PHP, and look at how you can tap into all this database power from your dynamic Web pages.
If you're feeling a little overwhelmed after going through all those SQL statements, don't worry! It's not essential that you understand every last detail at this stage. You just want to get an overall feel for how you go about interacting with a relational database such as MySQL. You'll become a lot more familiar with the specifics as you move on and start building applications for yourself.
Just as you used client programs to access the MySQL database server from the command line, PHP needs some client code of its own to talk to MySQL.


In the days of PHP4, this was a no-brainer because the necessary code was integrated right into the depths of PHP itself. Unfortunately that's no longer the case, and PHP5 expects the MySQL client libraries to be present on your system before compiling with PHP support. This is quite easy for Linux users: just ensure that you compile PHP5 using the -with-mysql option.


For Windows users, it's a little more complex. You need to:
  1. Ensure that the libmysql.dll library is in your system root path (libmysqli for versions of MySQL 4.1 and later).
  2. Uncomment the mysql extension in the extensions section of your php.ini file.
  3. Ensure that your php_mysql.dll extension is in a folder where PHP can find it. (This is set in the extension directory setting in the php.ini file.)
  4. Restart your Web server after you've implemented these changes.
That's it! Nothing too complicated—it's just like using any other extension in PHP. With everything set up, let's look at what you need to do to get PHP5 and MySQL talking to each other.

Monday, December 20, 2010

A Quick Play with MySQL

You'll start working with the MySQL server using the client program mysql and a few simple database queries. Along the way, you'll play with some database settings, create some new users (without quite as much access to everything as the root user has), and get a taste of how queries work in practice.

Starting the mysql Client Program

Fire up the mysql client by issuing the following command at your command prompt:


> mysql -uUSER -pPASSWORD -hHOST

Replace the USER, PASSWORD, and HOST arguments to reflect your personal settings. For example, if your database username and password are phpuser and phppass and you're connecting to the host db.whatever.com, the command would look like this:


> mysql -uphpuser -pphppass -hdb.whatever.com

All the arguments are optional. If missing, the following values are assumed for them:


Argument
Value
-u
Your shell account username
-P
No password
-h
localhost


The mysql client now connects to the database server running on the specified HOST with the given user ID/password combination. You may also specify a database to use by providing its name at the end of the command, like this:


> mysql -uUSER -pPASSWORD test

You should see a response similar to the following:


Welcome to the MySQL monitor.  Commands end with; or \ g.
Your MySQL connection id is 4 to server version: 4.0.18-nt

Type 'help;' or '\ h' for help. Type '\ c' to clear the buffer.

mysql>

If the mysql client complains that it can't connect to the specified server, check to see if the provided arguments are correct.

Selecting a Database to Use

To see the list of databases available, use the SHOW DATABASES command:


mysql> SHOW DATABASES;
+---------------+
| Database      |
+---------------+
| mysql         |
| test          |
+---------------+
2 rows in set (0.00 sec)

This is mostly the same information you got from mysqlshow earlier: there are still two databases, mysql and test, available in the system. This time, though, you're also told how many rows were returned (2) and how long it took the server to execute the query (0.00 seconds to two decimal places).


In fact, you can use the mysql client to do just about anything that the other, more specialized client programs can, as long as you know the correct syntax. Let's dig a little deeper.


To select a particular database, you can utilize the USE databasename syntax. Let's have a poke around inside one of the databases:


mysql> USE mysql;
Database changed

Your MySQL server is now ready to work with the database mysql. A word to the wise: be careful how you use the user table; it contains important information that you probably don't want to delete.
For the sake of readability, it is recommended that SQL keywords are written in uppercase characters, and user-defined names such as table and field names are lowercase. Remember that on the Linux/UNIX platforms, arguments are case-sensitive: Mysql, MYSQL, and mysql are all different database names.

Looking at Tables Inside a Database

You can use the SHOW TABLES command to list the existing tables in the currently selected database:


mysql> SHOW TABLES;
+-----------------+
| Tables in mysql |
+-----------------+
| columns_priv
| db              |
| func            |
| host            |
| tables_priv
| user            |
+-----------------+
6 rows in set (0.00 sec)

Once again, a slight elaboration on the results you previously got from mysqlshow. This time you can have a look at what's going on inside the tables. For example, let's see how the user table is structured. Use the DESCRIBE command, or DESC for short:


mysql> DESC user;
+--------------------+---------------------+------+-----+---------+-------+
| Field              | Type                | Null | Key | Default | Extra |
+--------------------+---------------------+------+-----+---------+-------+
| Host               | varchar(60) binary  |      | PRI |         |       |
| User               | varchar(16) binary  |      | PRI |         |       |
| password           | varchar(16)         |      |     |         |       |
| Select_priv        | enum('N','Y')       |      |     | N       |       |
| Insert_priv        | enum('N','Y')       |      |     | N       |       |
| Update_priv        | enum('N','Y')       |      |     | N       |       |
| Delete_priv        | enum('N','Y')       |      |     | N       |       |
| Create_priv        | enum('N','Y')       |      |     | N       |       |
| Drop_priv          | enum('N','Y')       |      |     | N       |       |
| Reload_priv        | enum('N','Y')       |      |     | N       |       |
| Shutdown_priv      | enum('N','Y')       |      |     | N       |       |
| Process_priv       | enum('N','Y')       |      |     | N       |       |
...
...
| File_priv          | enum('N','Y')       |      |     | N       |       |
| Grant_priv         | enum('N','Y')       |      |     | N       |       |
| References_priv    | enum('N','Y')       |      |     | N       |       |
| max_connections    | int(11) unsigned    |      |     | 0       |       |
+--------------------+---------------------+------+-----+---------+-------+
31 rows in set (0.00 sec)

This command describes the user table's structure—what fields are defined and how they've been configured. For example, you can see the Host field can hold up to 60 characters and is defined as a primary key. The User field also is defined as a primary key. That doesn't mean that the Host and User fields are both primary keys (you should remember that this is forbidden). Instead, the combination of the Host and User fields works as a sole primary key in Host-User format.


For example, if user john at localhost has been given the server access privileges, localhost-john becomes the primary key value for his record. Another user at localhost whose name is also john cannot be inserted as a record in this table. However, john at whatever.com can.
All the fields in the user table except the Host, User, Password, are declared as ENUM('N', 'Y'). This means that only one from the specified set of values (either N or Y in this case) can be used in each field. Note that N is the default value when no value is provided.

Using SQL to Look at Data

All the queries you've given to the mysql client so far have been specific to the MySQL database engine, and won't necessarily work with another system, whether or not it supports SQL. Now that you're working down at the level of actual data, though, it's time to use some bona fide SQL statements to find out what's there.


Let's say you want to find out if there are any users registered with access privileges from the local machine localhost. You might do this by issuing the following SELECT query:


mysql> SELECT User,Host FROM user;
+----------+-----------+
| User     | Host      +
+----------+-----------+
| root     |localhost  +
| dodge    |doggiesr   +
| james    |digistrawb +
+------------—---------+
3 rows in set (0.00 sec)


Because there are only three rows, it's not hard to see that only root is currently registered as a local user. Of course, it might be somewhat harder to spot this if you had a few hundred more users registered in the database, so here's another way to get the same result:


mysql> SELECT User FROM user WHERE Host='localhost';
+----------+
| User     |
+----------+
| root     |
+----------+
1 row in set (0.00 sec)

Using a WHERE clause narrows the scope of the retrieved records. This works much the same as PHP's if statement, except that the former uses a single equal sign (=) whereas the latter uses two (==) to test for equality.


If you wanted the server to retrieve all fields available instead of a few specified ones, you could use the * wildcard. For example:


mysql> SELECT * FROM tablename;

This command would retrieve all of the records from tablename.

Manipulating Data in a Database

Now let's create a new database user, by inserting a new record into the user table using the INSERT command:


mysql> INSERT INTO user VALUES(
    -> 'localhost',
    -> 'phpuser',
    -> Password('phppass'),
    -> 'N', 'N', 'N', 'N', 'N', 'N', 'N',
    -> 'N', 'N', 'N', 'N', 'N', 'N', 'N',
    -> 'N', 'N', 'N', 'N', 'N', 'N', 'N',
    -> 'N', 'N', 'N', 'N', 'N', 'N', 'N');
Query OK, 1 row affected (0.00 sec)

This INSERT query creates a record for the user phpuser, using the password phppass, in the user table, and specifies no access privileges. Each string value is placed within single quotes—this issue is discussed in greater detail later on. As you can see, the mysql client reports that the query was successfully executed, and that one row of data has been inserted as requested.


MySQL saves database user passwords after encrypting them. It uses its own password encryption scheme (which is different than Linux's), so use MySQL's built-in password() function to encrypt your password.


What if you change your mind and want to give user phpuser (who presently has no access privileges at all) the server administration privileges Reload_priv and Shutdown_priv? You could update the user table to reflect your whimsical decision using an UPDATE query, like this:


mysql> UPDATE user SET Reload_priv='Y', Shutdown_priv='Y'
    ->  WHERE User='phpuser';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

The mysql client reports that only one row matched the condition set in the query and that it has been successfully updated. User phpuser now has server administration privileges only: phpuser is only allowed to reload or shut down the server.


The WHERE clause is optional, and when omitted, the UPDATE query changes all the records in a given table with the new values provided. Warning: Make sure you don't change records by accident. A nightmare can sometimes turn into reality if you're not careful because a careless UPDATE query could, for example, change 10,000 user records to have the same password. Even worse, all the records in a table could unwittingly be deleted with a DELETE query such as:


mysql> DELETE FROM test;
Query OK, 0 rows affected (0.00 sec)

Here all of the records in the table test have been deleted. Don't be fooled by the mysql client reporting that no row is affected by this query. Following the query, the server has no way of knowing how many rows are affected (deleted) because they're gone forever after the query. Note that the syntax of this example DELETE command contains no clauses. Thankfully, you may specify which records are to be deleted with a WHERE clause:


DELETE FROM tablename WHERE condition(s);

Now you change you mind again and want to give user phpuser all of the server administration privileges. What do you do? You can make either another UPDATE query or you can REPLACE the whole record. The syntax is pretty simple:


mysql> REPLACE INTO user VALUES(
    -> 'localhost', 'phpuser', Password('phppass'),
    -> 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y',
    -> 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y',
    -> 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y',
    -> 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y');
Query OK, 1 row affected (0.00 sec)

So REPLACE overwrites an old record with a new one. Note that the new record and the old record must have the same value in a field designated as a key (or the combined key formed by the Host and User fields in this case). The difference between the UPDATE and REPLACE commands is that UPDATE replaces only the selected set of fields in a record, whereas REPLACE replaces the whole record with the given values.


Finally, to activate the newly created account phpuser, you need to flush privileges, reloading the privilege information from the table. (The server normally reads the access privilege information only once, when it's loaded.) First, exit mysql client, by typing quit, (otherwise you'll get an error saying that your SQL syntax is incorrect). Then:


> mysqladmin -uUSER -pPASSWORD -hHOST flush-privileges

Alternately you can issue the FLUSH PRIVILEGES command to reflect the change in the mysql client program:


mysql> FLUSH PRIVILEGES;

Using GRANT and REVOKE Commands

GRANT and REVOKE commands enable you to allocate and remove database privileges as you choose. They can be used at various levels of your database permissions structure, going from global to database to table to column level—depending on how fine grain you want your control to be. Of course, for whatever permissions you can grant a user, you can use a corresponding REVOKE statement to remove them.


In the following sections you'll see how to use these commands to control permissions on your tables. For example, earlier you created a new user by manipulating the user table directly. You can achieve the same goal more simply using the GRANT command. Let's see how.

GRANT

Here's the simplest form of the GRANT command:


mysql> GRANT ALL PRIVILEGES ON *.* TO
    -> phpuser@localhost IDENTIFIED BY 'phppass';

It gives every access privilege on every database in the system to the user phpuser at localhost if he logs on to the server using the password phppass. Note that the user and host arguments are not given in quotes.


If you want to let this user access the tables in the test database only, you'd use the following command:


mysql> GRANT ALL PRIVILEGES ON test.*
    -> TO phpuser@localhost IDENTIFIED BY 'phppass';

Here, test.* denotes all tables in the test database. Similarly you can grant access to only the sample table in test:


mysql> GRANT ALL PRIVILEGES ON test.sample
    -> TO phpuser@localhost IDENTIFIED BY 'phppass';

By replacing the ALL PRIVILEGES keyword with a selection of query types, a set of access privileges can be granted:


mysql> GRANT SELECT,INSERT,UPDATE ON test.*
    -> TO phpuser@% IDENTIFIED BY 'phppass';

With this command, the user phpuser can issue only SELECT, INSERT, and UPDATE queries to any of the tables in the test database. Any other query, such as DELETE, won't be allowed. Wildcards like * are extremely useful. For example, by replacing localhost with the % wild card, you can allow the user phpuser to access the specified tables from any server.


You can also specify the host with a partial domain name: if you used phpuser@%.whatever.com for instance, only users with the user ID phpuser connecting from the domain whatever.com would be granted access to the server.


You can split hairs even further, by specifying the fields that the user can access:


mysql> GRANT SELECT (User, Host) ON mysql.user
    -> TO phpuser@localhost IDENTIFIED BY 'phppass';

Here the user phpuser can issue only SELECT queries on the User and Host fields in the user table.


You can use the WITH GRANT OPTION clause to give the specified user the capability to grant other users any privileges he has at the specified privilege level:


mysql> GRANT ALL PRIVILEGES ON test.*
    -> TO phpuser@localhost IDENTIFIED BY 'phppass' WITH GRANT OPTION;

This command is equivalent to creating another superuser, called phpuser. Be careful with the WITH GRANT OPTION clause: two users with different privileges can easily team up and extend their privileges by exchanging them!

REVOKE

The REVOKE command removes access privileges from a user. If you want to revoke all privileges given to user phpuser, for example, issue the following command:


mysql> REVOKE ALL PRIVILEGES ON *.* FROM phpuser;

You can specify multiple usernames by separating them with commas:


mysql> REVOKE ALL PRIVILEGES ON *.*
     > FROM phpuser@localhost, phpuser2, phpuser3;

All specified users must exist, and possess the specified privileges, in order for this command to work.


The following command revokes only the SELECT privilege from the user phpuser:


mysql> REVOKE SELECT ON *.* FROM phpuser@localhost;

Again, you can split hairs by specifying field names:


mysql> REVOKE SELECT (User, Host) ON mysql.user FROM phpuser@localhost;

If you modify the grant tables using GRANT or REVOKE commands, the changes take effect immediately. You don't have to flush privileges or reload the server.

Monday, December 13, 2010

Relational Databases

One key characteristic of databases that you haven't looked at so far is how a database organizes data for users to interact with. Whenever you use the database engine to retrieve, modify, or add new data, you need to know what to ask for—moreover, when you come to design a database for yourself, you need to tell the database engine what sort of data to expect where, and how it's all meant to hang together.


Most of the databases you considered earlier use what's known as a relational data model. As such, they're known as relational databases (or Relational Database Management Systems—RDBMS). They arrange data into tables, each of which is divided into rows and columns.


In database parlance, each row in a table (apart from the heading) represents a data record: a set of intrinsically connected pieces of data. Likewise, each column represents a field: a specific type of data that has the same significance for each record in the table.
For all practical purposes, the term row is synonymous with record, whereas column is synonymous with field. This is useful to bear in mind when visualizing tables.
Suppose that the manager of a touch-rugby team (don't worry if you aren't familiar with touch rugby —the sport is incidental to the topic and won't affect your understanding of databases in any way) sets up a database so that he can track the matches in which his players compete. Naively, he asks each player to enter his details into the database after each match. After the second round of matches, the manager's table looks like this:


Player_Number
Name
Phone_number
Date_Played
Nickname
42
David
555-1234
03/03/04
Dodge
6
Nic
555-3456
03/03/04
Obi-d
2
David
555-6543
03/03/04
Witblitz
14
Mark
555-1213
03/03/04
Greeny
2
David
555-6543
02/25/04
Witblitz
25
Pads
555-9101
02/25/04
Pads
6
Nic
555-3456
02/25/04
Obi-d
7
Nic
555-5678
02/25/04
Nicrot


He soon realizes that this is going to make for a huge table after everyone on the team has played an entire season's worth of games. As you can see, the structure of the table is inefficient because each player's details—number, name, phone number, and so on—are entered every time he plays a match.


Such redundancy is undesirable in a database. For example, say that the player with the number 6 keeps dropping the ball, and his teammates decide to give him a new nickname (which won't be mentioned here). To update the table, every one of this player's records would have to be modified to reflect his new nickname.


In addition, every time a player enters his details after a match, all of that information is consuming valuable space on the hard drive. Redundancy is terribly inefficient, wasting a great deal of time and space.


Fortunately, in the early 1970s, Dr. E. F. Codd came up with a unique and powerful way to alleviate this type of problem. He stipulated a set of rules that, when applied to data, ensure that your database is well designed. In truth, he stipulated no small number of rules, but for most purposes you need only concern yourself with the first few. These requirements can be divided up into what are called normal forms, and making sure your data complies with these normal forms goes a long way to ensuring good relational database design. Let's take a look at what normalization is all about.

Normalization

Normalization is defined as "the process of breaking up the data into several tables, so as to minimize the number of times you have to repeat the same data." The normal forms are degrees of normalization, and they are governed by an elegant set of rules that can be summarized as follows:

First Normal Form (1NF)

  • Create a new table for each new set of related data you want to store.
  • Eliminate repeating information in an individual table.
  • Uniquely identify each record with a primary key.
For example, in 1NF you could split the preceding table into a player information table:


Player_Id
Name
Phone_number
Nickname
42
David
555-1234
Dodge
6
Nic
555-3456
Obi-d
14
Mark
555-1213
Greeny
2
David
555-6543
Witblitz
25
Pads
555-9101
Pads
6
Nic
555-3456
Obi-d
7
Nic
555-5678
Nicrot


And match log table:


Player_Id
Date_Played
42
03/03/04
6
03/03/04
2
03/03/04
14
03/03/04
2
25/02/04
25
25/02/04
6
25/02/04
7
25/02/04


Notice how the original table splits naturally into these two new tables. This is because the original table provided data about two distinct things (or entities): players and matches. Each new table contains data concerning just one entity. The split into entity tables is an important part of the normalization process.


Now you can take another step toward conforming to the first normal form: eliminating repeating information in an individual table. There's some redundant information in the player information table that can be removed to give you the following:


Player_Id
Name
Phone_number
Nickname
42
David
555-1234
Dodge
6
Nic
555-3456
Obi-d
2
David
555—6543
Witblitz
14
Mark
555-1213
Greeny
25
Pads
555-9101
Pads
7
Nic
555-5678
Nicrot


Are you done with this table yet? There's no longer any duplicated information, and the table represents only related information (information about the player entities). But what about a unique identifier? Each entity must have (at least) one unique field—in other words, a field in which no entries are repeated. Because of the exclusive nature of this field, each entry in it uniquely identifies (IDs) each record in the table. The field used to ID records is often known as the primary key. Only one primary key is permitted per table.
Remember, by moving the player information to one table and the match log entries to another, you simplify the process of modifying a player's details—you only have to modify one record in the player table.
You could add a unique key, which would assign a unique number to each record, but let's look at the information closely. In any given sport, a player's number must be unique, so you know that all the values in the Player_Id field are unique. But the Nickname field is also unique because each player can only have one nickname at a time. Which one should you choose? You've already seen that nicknames can change, based on performance, so the Player_Id field is the sensible choice for a primary key.


What about the match log table? Unfortunately, the Player_Id field is no longer unique because each player plays on many dates. Obviously, the date field can't be unique either, so you can't make a unique identifier out of either field. In this case, you need to add one yourself (you'll learn how later in the book). Once that's done, both tables are in first normal form. Adding an ID field could have hidden uses, especially if you wanted to join the information in the two fields of the match log table to create a unique key that you could relate to some other data (the number of tries scored per player per match, for example).
The second normal form has two goals, which are applicable only after you've achieved 1NF.

Second Normal Form (2NF)

  • Create new tables for each new group of values that apply to many records.
  • Relate new tables to existing tables with a foreign key (a key that identifies records in a different table).
Because both example tables contain a common field (Player_Id), whose values correspond between tables, you have a way to join them together—you don't actually need a foreign key to accomplish this, but there are definitely advantages to using foreign keys (For a more in-depth discussion on them, please go to www.mysql.com/doc/en/ANSI_diff_Foreign_Keys.html for a discussion on MySQL foreign keys.) This means that records in one table have a well-defined relationship to records in the other. The database now has all the information it needs to answer complex queries involving both tables, which it couldn't answer by looking at either table in isolation. With tables related like this, the database can truly be termed relational. The specific relationship described here is also called a one-to-many relationship. For each record in the player information table, you may have many records in the match log table. The ability to construct meaningful one-to-many relationships is a good indicator that your database meets the requirements of the second normal form.

Third Normal Form (3NF)

The third normal form's goal is to eliminate fields that do not depend on the primary key.
This is where the real world and the elegance of theory can collide. To some extent, it reverses the premise of a master table relationship by suggesting that anytime you may have data repeat in a record such as ZIP codes in a list of addresses, that data should be stored in a separate table and linked back to the individual records that use them with a new keyed relationship. Make no mistake; the purists will chastise you roundly for not doing it this way. But the truth is, database design is often a series of trade-offs; and you do sometimes have to weigh design idealism against a novel but useful concept—simplicity. The requirements of the third normal form simply may not satisfy your needs; and it's all right for you to decide on a case-by-case basis what will work best for any given situation.

Other Normal Forms

Believe it or not, there are more normal forms, the fourth and fifth among them. The fourth normal form states that you should isolate independent multiple relationships. This rule primarily seeks to deal with the problems of many-to-many relationships, where many records may have relationships to intersecting records in another table that may in turn link back to multiple records in the first table. Confused? Don't feel bad. This is where things get pretty hairy and you don't need to hold yourself accountable to this standard. The concept is placed here because you may very well encounter this situation down the road and it's useful as a strong suggestion for dealing with these issues.

Talking to Databases with SQL

At this point, you probably aren't at all surprised to learn that MySQL itself is a relational database. In fact, you're about to start applying the principles we've just been discussing with your very own database engine. First, though, you need to learn a little about the language used to interact with relational databases—a language called SQL.


SQL, the Structured Query Language, is a standard command set used to communicate with a relational database management system on any given platform. Every task such as creating databases or tables, as well as saving, retrieving, deleting, and updating data from databases, is done via SQL statements. Implementation of SQL features vary among RDBMS vendors, but because the basic concepts are identical, applying skills learned on one platform to another should be no more difficult than porting a computer program written for one platform to another using the same language. In this section you'll examine some basic features of SQL: data types, indexes, keys, and queries.
There are some differences in the way MySQL implements certain things and the ANSI SQL standard. You can check out a list ofthese differences at www.mysql.com/doc/en/Differences_from_ANSI.html.

SQL Data Types

When you create a database table, the type and size of each field must be defined. A field is similar to a PHP variable except that you can store only the specified type and size of data in a given field. Therefore, unlike PHP variables, you can't insert characters into an integer field, for example. The three usual sets of data types are supported in MySQL: numeric, date/time, and characters, which are described in the following tables:


Numeric Data Types
Description
Range/Format
INT
Normal-sized integer
(-231 to 231 -1), or (0 to 232 -1) if UNSIGNED
TINYINT
Very small integer
(-27 to 27 -1), or (0 to 28 -1) if UNSIGNED
SMALLINT
Small integer
( -215 to 215 -1), or (0 to 28 -1) if UNSIGNED
MEDIUMINT
Medium-sized integer
(-223 to 223 -1), or (0 to 224 • -1) if UNSIGNED
BIGINT
Large integer
(-263 to 263 -1), or (0 to 264 -1) if UNSIGNED
FLOAT
Single-precision floating-point number
Minimum non-zero ±1.176 × 10 - 38; maximum non-zero ±3.403 × 10 + 38
DOUBLE/REAL
Double-precision floating-point number
Minimum non-zero ±2.225 × 10 - 308; maximum non-zero ±1.798 × 10 + 308
DECIMAL
Float stored as string
Maximum range same as DOUBLE
Date/Time Data Types
Description
Range/Format
DATE
Date
YYYY-MM-DD format. Range 1000-01-01 to 9999-12-31
DATETIME
Date and time
YYYY-MM-DD hh:mm:ss format. Range 1000-01-01 00:00:00 to 9999-12-31 23:59:59
TIMESTAMP
Timestamp
YYYYMMDDhhmmss format. Range 19700101000000 to sometime in 2037
TIME
Time
hh:mm:ss format. Range -838:59:59 to 838:59:59
YEAR
Year
YYYY format. Range 1900 to 2155
Character Data Types
Description
Range/Format
CHAR
Fixed-length string
0–255 characters
VARCHAR
Variable-length string
0–255 characters
BLOB
Binary Large Object
Binary data 0-65535 bytes long
TINYBLOB
Small BLOB value
Binary data 0–255 bytes long
MEDIUMBLOB
Medium-sized BLOB
Binary data 0-16777215 bytes long
LONGBLOB
Large BLOB value
Binary data 0-4294967295 bytes long
TEXT
Normal-sized text field
0-65535 bytes
TINYTEXT
Small text field
0–255 bytes
MEDIUMTEXT
Medium-sized text
0-16777215 bytes
LONGTEXT
Large text field
0-4294967295 bytes
ENUM
Enumeration
Column values are assigned one value from a set list
SET
Set value(s)
Column values are assigned zero or more values from a set list


The difference between a CHAR and VARCHAR type field is that the former stores a fixed-length value no matter how short it may be, whereas the latter stores exactly as many bytes as necessary to keep a given value. Suppose you insert the string dodge into the fields defined as char_field CHAR(10) and varchar_field(10), for example. They will store the same string slightly differently:


char_field: 'dodge     '// five blank spaces are right padded
varchar_field: 'dodge' // no space

It follows that declaring character fields as VARCHAR type will save you some disk space. Don't be tempted to use VARCHAR type fields for storing every string, though, because that has drawbacks, too. The MySQL server processes CHAR type fields much faster than VARCHAR type, for one thing, because their length is predetermined. If your strings don't vary in length much, you're better off using CHAR type fields. Moreover, when your strings are all the same length, VARCHAR takes up more disk space, because it has to store the length of each string in one additional byte.


A final note on VARCHAR/CHAR type fields: if your table has at least one VARCHAR field, all character fields are converted to VARCHAR type even if you define them otherwise.

Indexes and Keys

Inexperienced database designers sometimes complain about their database engines being slow, a problem that's often explained by the lack of an index. An index is a separate sorted list of a selected field (or fields) in a table. To explain why indexing a table has a dramatic effect on database performance, first consider a table without indexes. Such a table is basically the same as a plain text file because the database engine must search it sequentially. Rows in a relational database are not inserted in any particular order—the server inserts them in an arbitrary manner. To make sure it finds all entries matching the information you want, the engine must perform a full table scan, which is slow and inefficient, particularly if there are only a few matches.


Now consider an indexed table. Instead of moving straight to the table, the engine can scan the index for items that match your requirements. Because the index is a sorted list, this scan can be performed very quickly. The index guides the engine to the relevant matches on the database table, and a full table scan is not necessary.
So why not just sort the table itself? This might be practical if you knew that there was only one field on which you might want to search. However, this is rarely the case. Because it's not possible to sort a database by several fields at once, the best option is to use an index, which is separate from the table.


What about the case of searching multiple tables at once? This is where you really benefit from an index. Searching for a possible match across joined databases without indexes is a terrible idea—the engine would have to check all possible combinations of rows in one table with those in another. For two tables each with 500 rows, this would be 500 times 500, or 250,000 combinations! Indexing speeds searches up dramatically. The engine checks the index of the first table to find the position of matches to the first part of the query, and then it uses the index to the second table to find matches to the second part of the query. In other words, you pull out the relevant records directly.


A primary key is a special index that, as you saw at the beginning of the chapter, is used to ID records and to relate tables to one another, providing the relational database model. Each related table must have one (and only one) primary key.


Indexes and primary keys can be derived from combinations of fields. For a key to be formed in this way, the combination of items from each field must still be unique.


Because an index brings about a significant boost in performance, you could create as many indexes as possible for maximum performance gain, right? Not always. An index is a sure-fire way to increase the speed of searching and retrieving data from a database, but sacrifices performance when saving or updating records, and also increases the size of a table. Why? When you insert a record into an indexed table, the database engine has to record its position in the corresponding index table. Do the math!


What's more, if you have more than one index on a table, multiple write operations have to be performed on the index table, too. So when creating indexes on a table, don't create more than you need. Limit indexed columns to those that will be searched or sorted frequently. If required, you can create additional indexes on a table as you need them to increase performance.

Queries

SQL statements or commands are used to construct queries. Queries are the questions your application asks a database engine, which then returns the records that meet the criteria specified in the query. Queries return an array of records that meet the specified conditions, and contain information from selected fields. PHP, and other languages that support database connectivity for that matter, can treat the returned array as a normal array variable. We'll return to this topic later; for now, just bear it in mind. The returned array of records, called the result set, is the database engine's answer to your query. If you ask the engine to retrieve the records containing John as the first name, for example, it returns all of the records conforming to the query. If none are found, NULL (discussed in the next section) is returned.
Some SQL commands are literally commands that tell the database engine to do something instead of asking for an answer: "Delete those rows of information that contain John as the first name!" SQL commands of this type don't return a result set.
NULL
Consider the following scenarios:


A class takes a spelling test at school, but the teacher has yet to mark the test sheets and insert the results into a database table. It would be unfair of the teacher to insert default values into the results column before the sheets have been marked because a default score is irrelevant in this context. What could you place in the column to signify that you are awaiting data?


You want to construct a database table containing information about endangered bird species. One of the fields provides the fastest recorded flight speed for each bird in the table. You begin to create records for endangered penguins, and then you remember—penguins can't fly! What do you place in the flight speed column to indicate that the field is not applicable to penguins?


In both of these cases data is missing from a table. The only difference is that, in the first case, the situation is temporary because the teacher will soon add the missing test results to the table, but in the second case the values for the flight speed of penguins is impossible to obtain. You need a way to represent missing data in fields.


In a MySQL table, a NULL entry represents a missing value. NULL doesn't belong to any particular data type, but it can replace any value. Because it is not a data type or value, but it can be a field entry, the concept of a NULL is often difficult to grasp for beginners and experienced programmers alike. Programmers often have a mistaken idea of NULL. For example, a common mistake is to think of NULL as zero, which is wrong because zero is a value; NULL is not. Strings filled with one or more blank spaces, and strings of zero length, may also be mistaken for NULL because string is a data type, but NULL isn't. NULL is nothing, no data type, no value.


So what happens if the result set from one of your queries contains a NULL, and that result set is then used in your program in subsequent calculations? The rule of thumb for math with NULL is that it propagates. Any arithmetic operation involving a NULL returns NULL. This makes sense because how could you provide results when all the data needed to perform the calculation is not present? This also applies to dividing a NULL value by zero; NULL is returned.
Query Commands
MySQL queries issued to manipulate data in a table can be constructed using the following main commands:
  • SELECT: Retrieves data from a database.
  • DELETE: Deletes data from a database.
  • INSERT—Inserts data into a database.
  • REPLACE: Replaces data in a database. If the same record exists in a table, the command overwrites the record with the new data.
  • UPDATE: Updates data in a table.
The rest of the command set involves creating or modifying the database structures, rather than the data stored in the databanse:
  • CREATE: Creates a database, table or index.
  • ALTER: Modifies the structure of a table.
  • DROP: Wipes out a database or table.
You'll see more of these commands as you work through the next few chapters. Just to give you a taste though, let's take a look at the typical form of a MySQL SELECT query, which retrieves records from a table:


mysql> SELECT field1, field2, ... , fieldn FROM table WHERE condition;

The first thing to note is that each query is terminated with a semicolon, just as a PHP statement is. A query statement may expand to multiple lines. The following, slightly more specific query, is essentially the same as this generic case:


mysql> SELECT last_name, first_name
    -> FROM user
    -> WHERE first_name = 'John'

Take a closer look at the FROM, WHERE, and ORDER BY clauses in the query. The query returns any record from the user table where the value of the first_name field is John. Assuming you have a table called user, here's a sample of the query's output:


Simpleton John
Smith John
Thomas John

Now let's put this into practice and have some fun with MySQL!