
Monday, May 17, 2010

Random Access to File Data

As you may have realized, it'd be more efficient if you opened the file for both read and write operations with a single fopen() call. However, using the functions you've met so far, you can only manipulate data sequentially, that is, in the same order that it is (or will be) arranged in the file. This puts a major limitation on what you can usefully achieve by doing it this way; once the file position indicator passes a certain point in a file, you need to close and reopen the file before you can access data at that point. Unless your data access is terribly well organized in advance (not likely in real-life situations, where it's often impossible to predict what you might need to do), you'd end up opening your data file as many (if not more) times as before.
What you really need is some way to move the file position indicator around in the file without having to close and reopen the file. There are a couple of functions that do just that:
  • fseek()
  • ftell()
PHP provides a number of functions that let you read from or write to specific positions within a file. Given a file handle (such as $fp) and an integer offset (5 in the following example), fseek() moves the file position indicator associated with $fp to the position determined by the offset. By default, the offset is measured in bytes from the beginning of the file. Here's an example:


fseek($fp, 5);
$one_char = fgetc($fp);

This code moves the file position indicator associated with $fp to just after the fifth byte in the file, so the call to fgetc() returns the sixth byte. An optional third argument (called whence in the documentation) specifies the point from which the offset is measured. It can take any of the following values:
  • SEEK_SET: The beginning of the file + offset (default).
  • SEEK_CUR: Current position + offset.
  • SEEK_END: End of the file + offset.
fseek() is rather unusual for a PHP function: it returns the integer 0, not 1, upon success, and -1 upon failure. You can't use it on files on remote hosts opened through an HTTP or FTP URL.
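The three whence values and the return value can be seen side by side in a short sketch. It assumes a ten-byte temporary file created with tmpfile() purely for illustration; the variable names are not from the original script.

```php
<?php
// Illustrative only: a temporary file stands in for a real data file.
$fp = tmpfile();                 // opened for reading and writing
fwrite($fp, "ABCDEFGHIJ");       // 10 bytes at offsets 0 through 9

$status = fseek($fp, 5);         // SEEK_SET is the default: go to offset 5
$from_start = fgetc($fp);        // sixth byte; position is now 6

fseek($fp, 2, SEEK_CUR);         // 6 + 2 = offset 8
$from_current = fgetc($fp);      // ninth byte

fseek($fp, -1, SEEK_END);        // one byte before the end: offset 9
$from_end = fgetc($fp);          // last byte

fclose($fp);
echo "$status $from_start $from_current $from_end\n";   // prints "0 F I J"
```

Note that a negative offset is the usual choice with SEEK_END, since the offset is added to the end-of-file position.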


The ftell() function takes a file handle and returns the current offset (in bytes) of the corresponding file position indicator. For example:


$fpi_offset = ftell($fp);
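As a rough illustration of how the value returned by ftell() changes as you write, read, and seek, here is a minimal sketch. The temporary file and the variable names are assumptions made for the demonstration.

```php
<?php
// Illustrative only: track the file position indicator with ftell().
$fp = tmpfile();
fwrite($fp, "Hello, world");     // 12 bytes written
$after_write = ftell($fp);       // 12: the indicator sits after the last byte written

fseek($fp, 0);                   // back to the beginning
$at_start = ftell($fp);          // 0

fread($fp, 5);                   // reading advances the indicator by 5 bytes
$after_read = ftell($fp);        // 5

fseek($fp, 3, SEEK_CUR);         // skip 3 more bytes
$after_seek = ftell($fp);        // 8

fclose($fp);
echo "$after_write $at_start $after_read $after_seek\n";   // prints "12 0 5 8"
```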
rewind()

The rewind() function is similar to the rewind button on a cassette player: it takes a file handle and resets the corresponding file position indicator to the beginning of the file. You can say:


rewind($fp);
which is functionally equivalent to:
fseek($fp, 0);

As you saw earlier, the fpassthru() function outputs file data from the current file position onward. If you have already read data from a file but want to echo the file's entire contents, you need to call rewind() first.
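The effect of calling rewind() before fpassthru() can be sketched as follows. The output buffering calls (ob_start() and ob_get_clean()) are used here only to capture what fpassthru() prints so the two results can be compared; they and the temporary file are assumptions of this example.

```php
<?php
// Illustrative only: fpassthru() outputs from the current position onward.
$fp = tmpfile();
fwrite($fp, "line one\nline two\n");
rewind($fp);

$first = fgets($fp);             // reads "line one\n"; the indicator moves past it

ob_start();
fpassthru($fp);                  // outputs only what is left: "line two\n"
$partial = ob_get_clean();

rewind($fp);                     // reset to the beginning of the file
ob_start();
fpassthru($fp);                  // now outputs the entire contents
$whole = ob_get_clean();

fclose($fp);
echo $partial;                   // line two
echo $whole;                     // line one, then line two
```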


You can use rewind() to revise the counter script so that it only has to open the data file once, for both reading and writing:


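<?php
//hit_counter09.php
$counter_file = "./count.dat";
// "r+" opens the existing file for both reading and writing
if(!($fp = fopen($counter_file, "r+"))) die ("Cannot open $counter_file.");
$counter = (int) fread($fp, 20);   // read the stored count (up to 20 bytes)
$counter++;

echo "You're visitor No. $counter.";
rewind($fp);                       // back to the start before overwriting
fwrite($fp, (string) $counter);
fclose($fp);
?>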

As you can see, the data file is opened only once, in read and write mode. After reading the last access number from the file and displaying the new count, you rewind the file to reset the file position indicator, and then write the new count over the old one.


Try it Out: Navigate Within a File
Here's another example that uses these three navigating functions (fseek(), ftell(), and rewind()):


<?php
//nav_file.php
$name_field_len = 15;
$country_code_field_len = 2;
$country_field_len = 20;
$email_field_len = 30;

if(!($fp = fopen("./address.dat", "r")))
      die ("Cannot open the address data file.");
do {
   $address = '';
   $field = fread($fp, $name_field_len);
   $address .= $field;
   $field = fread($fp, $country_code_field_len);
   $address .= $field;
   $field = fread($fp, $country_field_len);
   $address .= $field;
   $field = fread($fp, $email_field_len);
   $address .= $field;
   echo "$address<BR>";
} while ($field);

rewind($fp);

echo "<BR>";

fseek($fp, $name_field_len);

do {
   $country_code = fread($fp, $country_code_field_len);
   fseek($fp, ftell($fp) + $country_field_len +
                           $email_field_len +
                           $name_field_len + 1);
   //NB: change '+1' to '+2' if the file uses CRLF line endings (typical on Win32)
   echo "$country_code<BR>";
} while ($country_code) ;

fclose($fp);
?>

This script assumes you have an address book data file called address.dat in the current directory that looks like this (note that you must be very careful about the number of characters—even blank characters—in each field):


Wankyu Choi    KRRepublic of Korea   wankyu@whatyoumaycallit.com   
James Hetfield USUnited States       james@headbangers.com         
Nomura Sensei  JPJapan               nomura@nosuchsite.com         

Here's the output from a test run:


Wankyu Choi KRRepublic of Korea wankyu@whatyoumaycallit.com
James Hetfield USUnited States james@headbangers.com
Nomura Sensei JPJapan nomura@nosuchsite.com

KR
US
JP

Records in this data file are separated by a newline. (As mentioned earlier, a newline is CR LF on Windows, CR on classic Mac OS, and LF on Linux and Mac OS X.) Each field has a set length: 15 characters for the name field, two characters for the country code field, 20 for the country field, and 30 for the email field.
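Because every record has the same length, the byte offset of any field can be computed directly, which is what makes random access practical here. The following is a minimal sketch under stated assumptions: read_country_code() is a hypothetical helper (not part of nav_file.php), the file uses single-byte LF newlines, and the sample records are written to a temporary file with str_pad() so that the field widths are exact.

```php
<?php
// Field widths as in nav_file.php; $newline_len is 1 for LF, 2 for CR LF.
$name_field_len = 15;
$country_code_field_len = 2;
$country_field_len = 20;
$email_field_len = 30;
$newline_len = 1;
$record_len = $name_field_len + $country_code_field_len
            + $country_field_len + $email_field_len + $newline_len;   // 68

// Hypothetical helper: read the country code of record number $index (0-based).
function read_country_code($fp, $index, $record_len, $name_len, $code_len) {
    // Jump straight to the field: start of the record plus the name field.
    if (fseek($fp, $index * $record_len + $name_len) !== 0) {
        return false;
    }
    return fread($fp, $code_len);
}

// Build a fixed-width sample file; str_pad() pads each field with spaces.
$rows = array(
    array("Wankyu Choi", "KR", "Republic of Korea", "wankyu@whatyoumaycallit.com"),
    array("James Hetfield", "US", "United States", "james@headbangers.com"),
    array("Nomura Sensei", "JP", "Japan", "nomura@nosuchsite.com"),
);
$fp = tmpfile();
foreach ($rows as $row) {
    fwrite($fp, str_pad($row[0], $name_field_len)
              . str_pad($row[1], $country_code_field_len)
              . str_pad($row[2], $country_field_len)
              . str_pad($row[3], $email_field_len) . "\n");
}

// Read the third record's code first, then the first: no sequential pass needed.
$third = read_country_code($fp, 2, $record_len, $name_field_len, $country_code_field_len);
$first = read_country_code($fp, 0, $record_len, $name_field_len, $country_code_field_len);
fclose($fp);
echo "$third $first\n";   // prints "JP KR"
```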



How it Works

The example starts by opening address.dat in the current directory for reading. First, it displays all the records as they are. When fread() reaches end-of-file, it returns a value that evaluates to false (an empty string in current PHP versions), so the first loop terminates:


if(!($fp = fopen("./address.dat", "r")))
                die ("Cannot open the address data file.");
do {
   $address = '';
   $field = fread($fp, $name_field_len);
   $address .= $field;
   $field = fread($fp, $country_code_field_len);
   $address .= $field;
   $field = fread($fp, $country_field_len);
   $address .= $field;
   $field = fread($fp, $email_field_len);
   $address .= $field;
   echo "$address<BR>";
} while ($field);

Then the data file rewinds, and the file position indicator moves to the end of the first name field entry, so that you're ready to read and display the country code field values:


rewind($fp);
fseek($fp, $name_field_len);

A do...while loop is initiated. Within it, you assign the country code data to a variable with fread(), move the file position indicator to the start of the next country code field, and finally output the country code that was just read. You start by assigning a value to $country_code:


do {
   $country_code = fread($fp, $country_code_field_len);

Determine the exact position of the next country code field as follows:
  • Get the current position of the file position indicator, as returned by ftell($fp).
  • Add to this the total length of the remaining fields and a trailing newline character.
By using this as the second argument in a call to fseek(), you set the file position indicator to the appropriate point in the file:


fseek($fp, ftell($fp) + $country_field_len + $email_field_len +
                                                  $name_field_len + 1);

If the file uses CR LF line endings, as is typical on Windows, the newline length should be given as 2 instead of 1 (1 is correct for files that use a single LF or CR). You echo the value just read and close the loop, which cycles as long as $country_code evaluates to true:


      echo "$country_code<BR>";
} while($country_code);

Finally, the file is closed:


fclose($fp);
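As a side note, the ftell()-plus-offset seek used in the loop could also be written as a relative seek with SEEK_CUR. The sketch below compares the two forms on a temporary file filled with placeholder bytes; the file contents and the $skip variable are assumptions made for the comparison.

```php
<?php
// Bytes to skip after reading a country code:
// country (20) + email (30) + next record's name (15) + LF newline (1).
$skip = 20 + 30 + 15 + 1;        // 66

$fp = tmpfile();
fwrite($fp, str_repeat("x", 200));   // placeholder contents

// Absolute form, as in nav_file.php: current position plus the skip.
fseek($fp, 17);                  // just after the first country code
fseek($fp, ftell($fp) + $skip);
$absolute = ftell($fp);          // 83

// Relative form: let fseek() add the offset to the current position.
fseek($fp, 17);
fseek($fp, $skip, SEEK_CUR);
$relative = ftell($fp);          // 83

fclose($fp);
echo "$absolute $relative\n";    // prints "83 83"
```

Both calls leave the indicator at the start of the second record's country code field (68 + 15 = 83).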