Fire up the mysql client again, and let's take a closer look at the SELECT command. Remember that everything covered here is totally applicable to the preceding PHP examples—the table you see at the command line is simply a representation of the result set with which your PHP script ultimately has to deal.
Server Functions
Let's begin by going back to basics, issuing SELECT queries that don't require a database table at all. MySQL has a lot of very useful built-in server functions. Want to know what time it is? Don't have a watch? Ask the server:
mysql> SELECT now(); +--------------------------+ | now() | +--------------------------+ | 2000-08-03 16:56:03 | +--------------------------+ 1 row in set (0.00 sec)
The now() function returns the current time of the system on which the server is running. You can retrieve the current date and time separately using the curdate() and curtime() functions:
mysql> SELECT curdate(), curtime();
+-------------+-----------+
| curdate() | curtime() |
+-------------+-----------+
| 2004-03-08 | 16:57:19|
+-------------+-----------+
1 row in set (0.00 sec)
Many of MySQL's built-in functions are very similar to PHP's in both appearance and functionality. However, there are certain subtle differences. For example, MySQL's substring() function works much the same as PHP's substr(), but the former returns a substring counting from 1 whereas the latter counts from 0:
mysql> SELECT substring('test',1,1);
+------------------------+
| substring('test', 1,1) |
+------------------------+
| t |
+------------------------+
1 row in set (0.54 sec)
The equivalent call to PHP's substr() function would be:
$substring = substr('test', 0, 1);
Retrieving Fields
You've already seen how to retrieve data from a database table. If you want to look at the contents of the field userid in table user, you just say:
mysql> SELECT userid FROM user;
+--------+
| userid |
+--------+
| Dodge |
| Greeny |
| Mac |
| Nicrot |
| Pads |
| Spargy |
+--------+
6 rows in set (0.05 sec)
You can see that six user records exist in the user table. (Note that neither field names nor table names contain any spaces; they must be typed as one word, just as variables in PHP are.)
You can retrieve multiple fields by separating them with commas:
mysql> SELECT userid, username FROM user;
+--------+-----------------+
| userid | username |
+------- |-----------------+
| Nicrot | Nic Malan |
| Spargy | Andrew Sparg |
| Pads | Brian Reid |
| Dodge | Dave Mercer |
| Mac | Murray McCallum |
| Greeny | Mark Greenfield |
+--------+-----------------+
6 rows in set (0.00 sec)
And you can look at all the fields in the table by specifying *, as in:
mysql> SELECT * FROM user;
This is equivalent to specifying every field in the table. However, when writing applications, it's worth avoiding this notation, even when you want to get all the fields from a table. If you later alter the structure of the table by adding new fields, or changing the order in which they are arranged, your applications might find that the result sets they retrieve are not the ones they were expecting, with unpredictable results. Let's look at how you can use SQL to specify your result set more clearly.
Limiting the Number of Results Returned
You can narrow the scope of retrieved data by using LIMIT and WHERE clauses. Perhaps you want the MySQL server to retrieve just a few rows of records rather than spewing them all out. You can use a LIMIT clause to tell it to do this. For example, if you want to fetch the first two rows of data matched, counting from record 0, you could say:
mysql> SELECT userid, username FROM user LIMIT 0, 2;
+--------+--------------+
| userid | username |
+--------+--------------+
| Nicrot | Nic Malan |
| Spargy | Andrew Sparg |
+--------+--------------+
2 rows in set (0.00 sec)
If the starting point is not specified, zero is assumed, so the following query achieves the same goal:
mysql> SELECT userid, username FROM user LIMIT 2;
The LIMIT clause always comes at the end of the query.
A WHERE clause is used to selectively retrieve rows of data according to specified conditions. If you only want to consider players in the Mid position, for example, you could say:
mysql> SELECT userid, username FROM user
-> WHERE userposition = 'Mid';
+--------+--------------+
| userid | username |
+--------+--------------+
| Nicrot | Nic Malan |
| Spargy | Andrew Sparg |
+--------+--------------+
2 rows in set (0.00 sec)
An equals sign is used to specify the condition (the mid position, in this example). Apart from the equality operator, which is equivalent to PHP's == operator, comparison operators in WHERE clauses Plook much the same as those in PHP. The following table shows the MySQL comparison operators:
| Comparison Operator | Description |
|---|---|
| = | equal to |
| != or <> | not equal to |
| < | less than |
| > | greater than |
| <= | less than or equal to |
| >= | greater than or equal to |
There is also a null-safe comparison operator (<=>) that enables you to use values that may or may not be NULL. For example:
mysql> Select 1 <=> NULL;
+------------+
| 1 = NULL |
+------------+
| NULL |
+------------+
Multiple conditions can be specified using the logical operators AND and OR:
mysql> SELECT usernumber, userid, userposition FROM user
-> WHERE userposition = 'Utility Back'
-> OR userposition = 'Mid'
-> AND usernumber < 5;
+------------+--------+---------------+
| usernumber | userid | userposition |
+------------+--------+---------------+
| 2 | Nicrot | Mid |
| 3 | Spargy | Mid |
| 6 | Greeny | Utility back |
+------------+--------+------------—--+
3 rows in set (0.01 sec)
Wait a minute; something's wrong here. You only wanted to list rows of data whose usernumber field values are less than 5, but the last usernumber is 6. Why is this? It's very simple—the AND operator has only been applied to the userposition='Mid ' condition. When the server retrieves users who are utility backs, it doesn't get as far as the last condition, so the trailing AND operator is useless. To group conditions and force the precedence of these comparisons, simply use parentheses as you would in PHP:
mysql> SELECT usernumber, userid, userposition FROM user
-> WHERE (userposition = 'Utility Back'
-> OR userposition = 'Mid')
-> AND usernumber < 5;
+------------+--------+--------------+
| usernumber | userid | userposition |
+ -----------+--------+--------------+
| 2 | Nicrot | Mid |
| 3 | Spargy | Mid |
+ -----------+--------+--------------+
2 rows in set (0.47 sec)
Ordering the Results
If you need the retrieved rows of data to be sorted on a particular field, you can add an ORDER BY clause:
mysql> SELECT usernumber, userid, userposition FROM user
-> WHERE usernumber < 5 ORDER BY userid;
+ -----------+ -------+--------------+
| usernumber | userid | userposition |
+------------+--------+--------------+
| 4 | Dodge | Link |
| 2 | Nicrot | Mid |
| 1 | Pads | Winger |
| 3 | Spargy |Mid |
+------------+--------+--------------+
4 rows in set (0.05 sec)
When sorting character fields, MySQL orders them according to ASCII value, so that uppercase letters are all placed before any lowercase letters.
An ORDER BY clause sorts retrieved values in ascending order by default, so the preceding ORDER BY clause sorts the values in ascending order by userid.
To sort the values in descending order, use the DESC keyword:
mysql> SELECT userid FROM user ORDER BY userid DESC; +--------+ | userid | +--------+ | Spargy | | Pads | | Nicrot | | Mac | | Greeny | | Dodge | +--------+ 6 rows in set (0.03 sec)
You can sort the retrieved rows on multiple fields by separating the field names with commas. The following example sorts userposition (in ascending order) and userid (in descending order):
mysql> SELECT userid, username, userposition FROM user
-> WHERE userposition = 'Winger'
-> OR userposition = 'Mid'
-> ORDER BY userposition, userid DESC;
+--------+-----------------+--------------+
| userid | username | userposition |
+--------+-----------------+--------------+
| Spargy | Andrew Sparg | Mid |
| Nicrot | Nic Malan | Mid |
| Pads | Brian Reid | Winger |
| Mac | Murray McCallum | Winger |
+--------+-----------------+--------------+
4 rows in set (0.00 sec)
Bear in mind that ORDER BY works much better on an indexed field because indexed fields are already sorted.
Pattern Matching
With the LIKE or NOT LIKE operators, you can specify patterns to match against, using the following special characters:
- % matches any possible string, including empty ones (like * in shell or DOS commands)
- _ matches any single character (like ? in shell or DOS commands)
mysql> SELECT userid, username, userposition, useremail FROM user
-> WHERE useremail LIKE '%doggiesrugby.co.za'
-> ORDER BY username;
+--------+-----------------+--------------+-------------------------------+
| userid | username | userposition | useremail |
+--------+-----------------+--------------+-------------------------------+
| Pads | Brian Reid | Winger | StickyPads@doggiesrugby.co.za |
| Mac | Murray McCallum | Winger | murray@doggiesrugby.co.za |
+--------+-----------------+--------------+-------------------------------+
2 rows in set (0.49 sec)
Values in TEXT type fields are case-insensitive. If you want them to be compared case-sensitively, you need to use a BLOB type instead.
The pattern__c matches the userid Mac as two_wild cards will only match against exactly two characters. No matter how many_wildcards you add, mysql tries to find entries with the exact number of characters specified:
mysql> SELECT userid FROM user WHERE userid LIKE '__c'; +--------+ | userid + |--------+ | Mac | +------+ 1 row in set (0.00 sec)
The NOT LIKE operator does precisely the opposite. Here any userids that don't end in the letter y are selected:
mysql> SELECT userid FROM user
-> WHERE userid NOT LIKE '%y';
+--------+
| userid |
+--------+
| Dodge |
| Mac |
| Nicrot |
| Pads |
+--------+
4 rows in set (0.00 sec)
It is also possible to use regular expressions in MySQL, but that is beyond the scope of this chapter. If you want to look into them in more detail, please see the documentation at: www.mysql.com/doc/en/Pattern_matching.html.
Getting Summaries
Now let's look at how a MySQL server reports summaries on data in a table. MySQL has a number of aggregate functions that summarize the results of a query instead of retrieving rows:
- sum() reports the total of a given field.
- max() reports the largest number in the given field.
- min() reports the smallest number in the given field.
- avg() reports the average of the given field.
- count() reports the number of rows returned.
For instance, you can retrieve both the minimum and maximum values on integer fields by using the min() and max() functions:
mysql> SELECT min(usernumber), max(usernumber) FROM user;
+-----------------+-----------------+
+ min(usernumber) | max(usernumber) |
+-----------------+-----------------+
| 1 6 |
+-----------------+-----------------+
1 row in set (0.00 sec)
The number of rows returned can be obtained by using count(). You can apply it in two ways:
- count(fieldname) counts only rows where the value of the specified field isn't NULL.
- count (*) counts every row in the result set.
The following query uses count(*) to show the number of users with usernmuber less than 3:
mysql> SELECT count(*) FROM user WHERE usernumber < 3;
+----------+
| count(*) |
+----------+
| 2 |
+----------+
1 row in set (0.00 sec)
More Complex Retrievals
Of course, what we have seen so far is not the end of the story. There are a host of keywords and queries that you can run against MySQL to meet every need. Let's say, for example, that you want a list of the players who have accessed pages ending in .html in an access_log table A simple list of the pages visited would give you this information but would probably contain a lot of duplicate entries—especially in a larger database. What you really want is a list of distinct values—in other words, no duplicate values. It'll come as no surprise to find that SQL offers the keyword DISTINCT to specify such a list. You can use it like this:
mysql> SELECT DISTINCT userid FROM access_log WHERE page LIKE '%.html'
-> ORDER BY userid;
+--------+
| userid |
+--------+
| Brian |
| Greeny |
| Nicrot |
| Pads |
+--------+
4 rows in set (0.00 sec)
You may want to apply aggregate functions (min(), max(), count() and so on) to specific groups of records rather than to the database as a whole. For example, you might want to count how many of your users play in a specific position. To do so, you can group the rows returned by means of a GROUP BY clause:
mysql> SELECT userposition, count(*) FROM user GROUP BY userposition
-> ORDER BY userposition DESC;
+--------------+---------+
| userposition | count(*) |
+--------------+----------+
| Winger | 2 |
| Utility back | 1 |
| Mid | 2 |
| Link | 1 |
+--------------+----------+
4 rows in set (0.00 sec)
If you mix field names and aggregate functions in a single query without using a GROUP BY clause, you're bound to see an error:
mysql> SELECT userposition, count(*) FROM user ORDER BY userposition;ERROR 1140: Mixing of GROUP columns (MIN(),MAX(),COUNT()...) with no GROUP
columns is illegal if there is no GROUP BY clause
Remember that an ORDER BY clause goes at the end of a query (unless you LIMIT the query, which must go at the end) and a GROUP BY clause should immediately precede it.
Now let's say that you want to sort your last table by the values in the second column—that is, sort its rows by the user counts returned against each position. You can't refer to count(*) in an ORDER BY clause, so what do you do? MySQL provides a solution: aliases. An alias is effectively just a new name for an expression, a field, or even a table. When you specify a field (or expression) in the SELECT statement, you can append a term like AS alias_name to it.
The result of count(*) now shows up in the result set as a field called num_users. What's more, you can treat the alias name as a normal field and use it in an ORDER BY clause:
mysql> SELECT userposition, count(*) AS num_users FROM user
-> GROUP BY userposition ORDER BY num_users;
+--------------+-----------+
| userposition | num_users |
+--------------+-----------+
| Link | 1 |
| Utility back | 1 |
| Mid | 2 |
| Winger | 2 |
+--------------+-----------+
4 rows in set (0.00 sec)
What about listing the positions for which you have more than one player? You might say this:
mysql> SELECT userposition, count(*) AS num_users FROM user
-> WHERE num_users > 1 GROUP BY userposition;
ERROR 1054: Unknown column 'num_users' in 'where clause'
But there's a problem-num_users can't be defined until you've specified how to group the records, but you need to specify the WHERE clause before the GROUP BY clause or you'll just see a syntax error. The solution is to use a HAVING clause, which enables you to specify a condition on a column or alias at the end of the SELECT statement. Use it in just the same way as WHERE, except that you place it immediately after the GROUP BY clause. The server executes the query as it did before, but before returning the result set, it filters out the results that don't meet the specified condition:
mysql> SELECT userposition, count(*) AS num_users FROM user
-> GROUP BY userposition HAVING num_users > 1;
+---------------+-----------+
| userposition | num_users |
+---------------+-----------+
| Mid | 2 |
| Winger | 2 |
+---------------+-----------+
2 rows in set (0.00 sec)
How would you pull data from both of your tables? Just specify them both in the FROM clause, like this:
mysql> SELECT username, page FROM user, access_l +-----------------+-----------------+ | username | page | +-----------------+-----------------+ | Nic Malan | /score.html | | Andrew Sparg | /score.html | | Brian Reid | /score.html | | Dave Mercer | /score.html | | Murray McCallum | /score.html | | Mark Greenfield | /score.html | | Nic Malan | /stats.html | | Andrew Sparg | /stats.html | | Brian Reid | /stats.html | | Dave Mercer | /stats.html | ... ... | Mark Greenfield | /log.html | | Nic Malan | /penalties.html | | Andrew Sparg | /penalties.html | | Brian Reid | /penalties.html | | Dave Mercer | /penalties.html | | Murray McCallum | /penalties.html | | Mark Greenfield | /penalties.html | +-----------------+-----------------+ 60 rows in set (0.00 sec)
You haven't defined a relationship between the two fields, so this query returns every possible combination of values. However, the MySQL server is smart enough to know which field belongs to which table. What if you use a field name that's in both tables?
mysql> SELECT userid, page from user, access_log;
ERROR 1052: Column: 'userid' in field list is ambiguous
Well, the error message says it all. In this example, the use of the userid field is ambiguous because you could be specifying either of two separate fields. You know that they refer to the same data, but as far as the server is concerned, the fields are completely independent of one another. You can resolve the ambiguity by explicitly specifying which table from which you want to take the field values, using the following notation:
mysql> SELECT user.userid, page FROM user, access_log; +--------+-----------------+ | userid | page | +--------+-----------------+ | Dodge | /score.html | | Greeny | /score.html | | Mac | /score.html | | Nicrot | /score.html | | Pads | /score.html | | Spargy | /score.html | ... ... | Spargy | /log.html | | Dodge | /penalties.html | | Greeny | /penalties.html | | Mac | /penalties.html | | Nicrot | /penalties.html | | Pads | /penalties.html | | Spargy | /penalties.html | +--------+-----------------+ 60 rows in set (0.01 sec)
Every entry in the access log table should correspond to one in the user table—that's the way the database is designed to work. It makes sense really: You're only logging page accesses for registered users, so anyone showing up in the access log must be a registered user. In this case, the userid field is common to both tables. Say, for example, that you want to look at who's been viewing which pages. Now that you can make a distinction between the two fields called userid, you can equate them in a WHERE clause, and narrow the result set as required:
mysql> SELECT user.userid, page FROM user, access_log
-> WHERE user.userid = access_log.userid;
+--------+-----------------+
| userid | page |
+--------+-----------------+
| Dodge | /score.html |
| Dodge | /stats.html |
| Greeny | /log.html |
| Greeny | /penalties.html |
| Mac | /score.html |
| Nicrot | /log.html |
| Nicrot | /refs.html |
| Pads |/score.html |
+--------+-----------------+
8 rows in set (0.00 sec)
You've just witnessed the power of relational databases. The connection you just made to relate data records in multiple tables is called a join—you're producing a result set by joining the records from one table to those in another. In your last query, the userid field served as a key to join the user and access_log tables; that is, you used userid to establish explicit relationships between the records in the two tables. Keys are essential in join operations because they are related across tables, removing the need for database designers having to repeat that data in every table.
There are many more advanced join operations than the one you've seen here, but this one is more than adequate for most situations. You may want to read some more advanced MySQL references for further information on additional complex queries and join operations. However, one rule of thumb applies no matter what level you're aiming for in building your SQL muscle: practice makes perfect. Use your mysql client to experiment with queries with a variety of field types—ultimately, that's the only way to really appreciate the possibilities and intricacies of the SQL syntax.
