A C program similar to grep command in UNIX

By: Abinaya

Let us design and write a program to print each line of its input that contains a particular ``pattern'' or string of characters. (This is a special case of the UNIX program grep.) For example, searching for the pattern of letters ``ould'' in the set of lines
   Ah Love! could you and I with Fate conspire
   To grasp this sorry Scheme of Things entire,
   Would not we shatter it to bits -- and then
   Re-mould it nearer to the Heart's Desire!
will produce the output
   Ah Love! could you and I with Fate conspire
   Would not we shatter it to bits -- and then
   Re-mould it nearer to the Heart's Desire!
The job falls neatly into three pieces:
   while (there's another line)
       if (the line contains the pattern)
           print it
Although it's certainly possible to put the code for all of this in main, a better way is to use the structure to advantage by making each part a separate function. Three small pieces are better to deal with than one big one, because irrelevant details can be buried in the functions, and the chance of unwanted interactions is minimized. And the pieces may even be useful in other programs.

``While there's another line'' is handled by the getline function, and ``print it'' is printf, which someone has already provided for us. This means we need only write a routine to decide whether the line contains an occurrence of the pattern.

We can solve that problem by writing a function strindex(s,t) that returns the position or index in the string s where the string t begins, or -1 if s does not contain t. Because C arrays begin at position zero, indexes will be zero or positive, and so a negative value like -1 is convenient for signaling failure. When we later need more sophisticated pattern matching, we only have to replace strindex; the rest of the code can remain the same. (The standard library provides a function strstr that is similar to strindex, except that it returns a pointer instead of an index.)
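The relationship between strstr and strindex can be sketched as follows. This is only an illustration, not part of the program: the name strindex_via_strstr is made up here, and the special case for an empty pattern is an assumption added so that the result agrees with the strindex shown in this tutorial, which reports -1 for an empty search string.

```c
#include <stddef.h>
#include <string.h>

/* strindex_via_strstr: hypothetical wrapper; index of t in s, -1 if none */
int strindex_via_strstr(const char s[], const char t[])
{
    const char *p;

    if (t[0] == '\0')          /* strstr would match at s; strindex reports -1 */
        return -1;
    p = strstr(s, t);          /* pointer to the first occurrence, or NULL */
    return (p == NULL) ? -1 : (int)(p - s);   /* pointer difference is the index */
}
```

Subtracting the start of the string from the pointer that strstr returns gives the same zero-based index that strindex computes by hand.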

Given this much design, filling in the details of the program is straightforward. Here is the whole thing, so you can see how the pieces fit together. For now, the pattern to be searched for is a literal string, which is not the most general of mechanisms. We will return shortly to a discussion of how to initialize character arrays. The listing also includes a slightly different version of getline.

   #include <stdio.h>
   #define MAXLINE 1000 /* maximum input line length */

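   /* On POSIX systems <stdio.h> may declare its own getline; if the
      compiler reports a conflict, rename this one (for example, get_line). */
   int getline(char line[], int max);
   int strindex(char source[], char searchfor[]);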

   char pattern[] = "ould";   /* pattern to search for */

   /* find all lines matching pattern */
   int main(void)
   {
       char line[MAXLINE];
       int found = 0;

       while (getline(line, MAXLINE) > 0)
           if (strindex(line, pattern) >= 0) {
               printf("%s", line);
               found++;
           }
       return found;
   }

   /* getline:  get line into s, return length */
   int getline(char s[], int lim)
   {
       int c, i;

       i = 0;
       while (--lim > 0 && (c=getchar()) != EOF && c != '\n')
           s[i++] = c;
       if (c == '\n')
           s[i++] = c;
       s[i] = '\0';
       return i;
   }

   /* strindex:  return index of t in s, -1 if none */
   int strindex(char s[], char t[])
   {
       int i, j, k;

       for (i = 0; s[i] != '\0'; i++) {
           for (j=i, k=0; t[k]!='\0' && s[j]==t[k]; j++, k++)
               ;
           if (k > 0 && t[k] == '\0')
               return i;
       }
       return -1;
   }
Each function definition has the form
   return-type function-name(argument declarations)
   {
       declarations and statements
   }
Various parts may be absent; a minimal function is
   dummy() {}
which does nothing and returns nothing. A do-nothing function like this is sometimes useful as a placeholder during program development. If the return type is omitted, int is assumed; this implicit-int rule belongs to older C, and C99 and later require an explicit return type.

A program is just a set of definitions of variables and functions. Communication between the functions is by arguments and values returned by the functions, and through external variables. The functions can occur in any order in the source file, and the source program can be split into multiple files, so long as no function is split.
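As a small sketch of these three channels, the fragment below passes a string in as an argument, hands a count back as the return value, and reads the character to look for from an external variable. The names needle and count_needle are invented for this illustration and do not appear in the program above.

```c
char needle = 'x';     /* external variable: visible to every function below */

/* count_needle: hypothetical helper; count occurrences of needle in s */
int count_needle(const char s[])    /* s arrives as an argument */
{
    int i, n;

    n = 0;
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == needle)          /* reads the external variable */
            n++;
    return n;                        /* result goes back as the return value */
}
```

A caller can change what is counted by assigning to needle before the call, in the same way that the search program above keeps its pattern in an external array.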

The return statement is the mechanism for returning a value from the called function to its caller. Any expression can follow return:

   return expression;
The expression will be converted to the return type of the function if necessary. Parentheses are often used around the expression, but they are optional.
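A minimal sketch of that conversion, using a made-up function named truncate_half: the expression after return has type double, but the function is declared to return int, so the value is converted (here, the fractional part is discarded) before it reaches the caller. Some compilers can warn about such narrowing conversions when asked to.

```c
/* truncate_half: hypothetical example; the double result is converted to int */
int truncate_half(int n)
{
    return (n / 2.0);   /* parentheses are optional; 7 / 2.0 is 3.5, returned as 3 */
}
```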

The calling function is free to ignore the returned value. Furthermore, there need be no expression after return; in that case, no value is returned to the caller. Control also returns to the caller with no value when execution ``falls off the end'' of the function by reaching the closing right brace. It is not illegal, but probably a sign of trouble, if a function returns a value from one place and no value from another. In any case, if a function fails to return a value, its ``value'' is certain to be garbage.
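These cases can be illustrated with a short sketch. The names largest, update_largest, and reset_if_negative are hypothetical and exist only for this example; the second function is declared void, so a return with no expression and falling off the end are both well defined.

```c
int largest = 0;                 /* hypothetical external variable */

/* update_largest: remember n if it is the largest so far; return the largest */
int update_largest(int n)
{
    if (n > largest)
        largest = n;
    return largest;
}

/* reset_if_negative: returns no value on either path */
void reset_if_negative(int n)
{
    if (n >= 0)
        return;                  /* bare return: nothing goes back to the caller */
    largest = 0;
}                                /* falling off the end also returns no value */
```

A statement such as update_largest(5); is a legal call that simply discards the returned value, while reset_if_negative has no value to use in the first place.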

The pattern-searching program returns a status from main, the number of matches found. This value is available for use by the environment that called the program.

The mechanics of how to compile and load a C program that resides on multiple source files vary from one system to the next. On the UNIX system, for example, the cc command does the job. Suppose that the three functions are stored in three files called main.c, getline.c, and strindex.c. Then the command

   cc main.c getline.c strindex.c
compiles the three files, placing the resulting object code in files main.o, getline.o, and strindex.o, then loads them all into an executable file called a.out. If there is an error, say in main.c, the file can be recompiled by itself and the result loaded with the previous object files, with the command
   cc main.c getline.o strindex.o
The cc command uses the ``.c'' versus ``.o'' naming convention to distinguish source files from object files.
