Reading word by word from a file in PHP
By: David Sklar
You want to do something with every word in a file. Read in each line with fgets(), separate the line into words, and process each word:
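$fh = fopen('great-american-novel.txt', 'r')
    or die(error_get_last()['message'] ?? 'Cannot open file');
while (! feof($fh)) {
    // Compare against false so a line containing only "0" is not skipped.
    if (($s = fgets($fh, 1048576)) !== false) {
        $words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
        // process words
    }
}
fclose($fh) or die(error_get_last()['message'] ?? 'Cannot close file');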
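The read-split-process loop can also be wrapped in a generator so that calling code iterates over words directly. This is only a sketch, not part of the original recipe: the function name words_in_file() and the temporary sample file are assumptions made for illustration.

```php
<?php
// Hypothetical helper: words_in_file() is not part of PHP or of the recipe.
function words_in_file(string $path): Generator
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // fgets() returns false at end of file, which ends the loop.
        while (($line = fgets($fh)) !== false) {
            // Same whitespace split as the recipe, one word yielded at a time.
            foreach (preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY) as $word) {
                yield $word;
            }
        }
    } finally {
        // Runs when the generator finishes or is destroyed early.
        fclose($fh);
    }
}

// Usage: write a small sample file, then iterate over its words.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "It was a dark\nand stormy night.\n");
foreach (words_in_file($path) as $word) {
    print $word . "\n";
}
unlink($path);
```

Because the generator yields one word at a time, only the current line is held in memory, which matches the line-by-line approach of the recipe.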
Here's how to calculate average word length in a file:
$word_count = $word_length = 0;

if ($fh = fopen('great-american-novel.txt', 'r')) {
    while (! feof($fh)) {
        if (($s = fgets($fh, 1048576)) !== false) {
            $words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
            foreach ($words as $word) {
                $word_count++;
                $word_length += strlen($word);
            }
        }
    }
    fclose($fh);
}

// Guard against division by zero when the file is missing or has no words.
if ($word_count > 0) {
    printf("The average word length over %d words is %.02f characters.",
        $word_count, $word_length / $word_count);
}
How you process every word depends on how "word" is defined. The code in this recipe uses the Perl-compatible regular-expression engine's \s whitespace metacharacter, which matches space, tab, newline, carriage return, and formfeed. Splitting on a single literal space instead is useful when the words have to be rejoined with spaces afterward.

The Perl-compatible engine also has a word-boundary assertion (\b) that matches between a word character (a letter, digit, or underscore) and a nonword character (anything else). The most noticeable difference when using \b instead of \s to delimit words is the treatment of words with embedded punctuation. The term 6 o'clock is two words when split by whitespace (6 and o'clock); it's four words when split by word boundaries (6, o, ', and clock), not counting the whitespace between 6 and o.
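The difference between the two delimiters can be checked with a short script. This is a minimal sketch: the helper names split_on_whitespace() and split_on_boundaries() are made up for illustration, and the second helper discards whitespace-only pieces, which is an assumption about what should count as a word.

```php
<?php
// Hypothetical helper names; only preg_split() and its flags come from PHP.

// Split on runs of whitespace, as the recipe does.
function split_on_whitespace(string $s): array
{
    return preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
}

// Split at word boundaries, then drop pieces that are only whitespace.
function split_on_boundaries(string $s): array
{
    $pieces = preg_split('/\b/', $s, -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_filter($pieces, function ($p) {
        return trim($p) !== '';
    }));
}

print_r(split_on_whitespace("6 o'clock"));  // elements: 6, o'clock
print_r(split_on_boundaries("6 o'clock"));  // elements: 6, o, ', clock
```

Without the filtering step, the split on \b also returns the single space between 6 and o as its own piece.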