Tuesday, September 2, 2014

Unix-Searching the contents of a file

Searching the contents of a file

Simple searching using less

Using less, you can search through a text file for a keyword (pattern). For example, to search through science.txt for the word 'science', type
% less science.txt
then, still in less, type a forward slash [/] followed by the word to search
/science
As you can see, less finds and highlights the keyword. Type [n] to search for the next occurrence of the word.

grep (don't ask why it is called grep)

grep is one of many standard UNIX utilities. It searches files for specified words or patterns. First clear the screen, then type
% grep science science.txt
As you can see, grep has printed out each line containing the word science.
Or has it?
Try typing
% grep Science science.txt
The grep command is case sensitive; it distinguishes between Science and science.
To ignore upper/lower case distinctions, use the -i option, i.e. type
% grep -i science science.txt
To search for a phrase, or for a pattern that contains spaces or other special characters, enclose it in single quotes (the apostrophe symbol). For example, to search for spinning top, type
% grep -i 'spinning top' science.txt
Some of the other options of grep are:
-v display those lines that do NOT match
-n precede each matching line with the line number
-c print only the total count of matched lines 
Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, the number of lines without the words science or Science is
% grep -ivc science science.txt
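The options above can be tried on a small file with known contents. This is a minimal sketch: the file sample.txt and its four lines are made up for illustration and stand in for science.txt.

```shell
# Throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical sample file: three of its four lines mention
# science in some letter case.
printf '%s\n' \
  'Science is fun' \
  'I like science' \
  'Art is fun too' \
  'SCIENCE rules' > "$workdir/sample.txt"

# -n: prefix each matching line with its line number (case sensitive).
numbered=$(grep -n science "$workdir/sample.txt")

# -i -c: count the lines that match in any letter case.
matching=$(grep -ic science "$workdir/sample.txt")

# -i -v -c: count the lines that do NOT match in any letter case.
non_matching=$(grep -ivc science "$workdir/sample.txt")

echo "numbered:     $numbered"
echo "matching:     $matching"
echo "non-matching: $non_matching"

rm -r "$workdir"
```

Only the second line contains the lower-case word, so -n reports a single line, while -i counts three matching lines and -iv counts the one remaining line.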


grep is one of the most powerful tools in Unix. The name comes from the ed editor command g/re/p, which stands for “globally search for a regular expression and print”. Much of the power of grep comes from regular expressions.
The general syntax of the grep command is
grep [options] pattern [files]
1. Write a command to print the lines that contain the pattern “july” in all the files in a particular directory?
grep july *
This will print all the lines in all files that contain the word “july” along with the file name. If any of the files contain words like “JULY” or “July”, the above command would not print those lines.
2. Write a command to print the lines that contain the word “july” in all the files in a directory and also suppress the filename in the output.
grep -h july *
3. Write a command to print the lines that contain the word “july” while ignoring the case.
grep -i july *
The ‘-i’ option makes the grep command treat the pattern as case insensitive.
4. When you use a single file as input to the grep command to search for a pattern, it won’t print the filename in the output. Now write a grep command to print the filename in the output without using the ‘-H’ option.
grep pattern filename /dev/null
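/dev/null, the null device, is a special file that discards any data written to it and always reads as an empty file. It can never contain a match, but because grep is now given two files, it prefixes each matching line with the filename.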
Another way to print the filename is using the ‘-H’ option. The grep command for this is
grep -H pattern filename
5. Write a Unix command to display the lines in a file that do not contain the word “july”?
grep -v july filename
The ‘-v’ option tells grep to print the lines that do not contain the specified pattern.
6. Write a command to print the names of the files in a directory that contain the word “july”?
grep -l july *
The ‘-l’ option makes the grep command print only the names of the files that contain the pattern, without printing the matching lines. As soon as grep finds the pattern in a file, it prints the filename and stops searching the remaining lines of that file.
7. Write a command to print the names of the files in a directory that do not contain the word “july”?
grep -L july *
The ‘-L’ option makes the grep command print the names of the files that do not contain the specified pattern.
8. Write a command to print the line numbers along with the lines that contain the word “july”?
grep -n july filename
The ‘-n’ option prints the line number in front of each matching line. The line numbers start from 1.
9. Write a command to print the lines that start with the word “start”?
grep '^start' filename
The ‘^’ symbol anchors the pattern to the start of the line.
10. Write a command to print the lines which end with the word “end”?
grep 'end$' filename
The ‘$’ symbol anchors the pattern to the end of the line.
11. Write a command to select only those lines containing “july” as a whole word?
grep -w july filename
The ‘-w’ option makes the grep command match the pattern only as a whole word, that is, only where it is not directly preceded or followed by a letter, digit, or underscore. For example, the string “mikejulymak” contains the pattern “july”, but not as a whole word, so ‘-w’ does not select that line.
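
Several of the answers above can be checked against files with known contents. This is a minimal sketch: the files a.txt, b.txt and c.txt and their contents are made up for illustration, and it assumes a grep that supports -L and -w (GNU and BSD grep do, but POSIX does not require them).

```shell
# Throwaway directory with three hypothetical sample files.
workdir=$(mktemp -d)
printf '%s\n' 'july sales report' 'start of summer' > "$workdir/a.txt"
printf '%s\n' 'mikejulymak' 'the very end' > "$workdir/b.txt"
printf '%s\n' 'nothing here' > "$workdir/c.txt"

# Each command substitution runs in a subshell, so the cd does not
# change the directory of the calling shell.

# -l: names of the files that contain the pattern.
with_july=$(cd "$workdir" && grep -l july *)

# -L: names of the files that do not contain the pattern.
without_july=$(cd "$workdir" && grep -L july *)

# -w: whole-word matches only, so "mikejulymak" is not selected.
whole_word=$(cd "$workdir" && grep -w july *)

# ^ and $ anchor the pattern to the start or end of the line.
starts=$(cd "$workdir" && grep '^start' *)
ends=$(cd "$workdir" && grep 'end$' *)

# A second file argument (/dev/null) makes grep print the filename
# even though only one real file is searched.
named=$(cd "$workdir" && grep july a.txt /dev/null)

printf '%s\n' "-l: $with_july" "-L: $without_july" "-w: $whole_word"
printf '%s\n' "^:  $starts" "\$:  $ends" "/dev/null: $named"

rm -r "$workdir"
```

With these files, -l lists a.txt and b.txt, -L lists only c.txt, and -w selects only the line in a.txt, because “july” in b.txt is embedded in a longer word.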

wc (word count)

A handy little utility is the wc command, short for word count. To do a word count on science.txt, type
% wc -w science.txt
To find out how many lines the file has, type
% wc -l science.txt
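
When the count is needed inside a script rather than on the screen, one common approach is to feed the file to wc on standard input, so that only the number is printed. This is a minimal sketch using a made-up two-line file in place of science.txt.

```shell
# Throwaway directory with a hypothetical two-line, five-word file.
workdir=$(mktemp -d)
printf '%s\n' 'one two three' 'four five' > "$workdir/sample.txt"

# Reading from standard input makes wc print only the number,
# without the filename after it.
words=$(wc -w < "$workdir/sample.txt")
lines=$(wc -l < "$workdir/sample.txt")

# Some wc implementations pad the number with spaces; arithmetic
# expansion normalises it to a plain integer.
words=$((words))
lines=$((lines))

echo "words: $words, lines: $lines"

rm -r "$workdir"
```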
