= Filters - find, grep, sed and awk =

== find ==

The find command in UNIX is a command-line utility for walking a file hierarchy. It can locate files and directories and then perform further operations on them. It supports searching by type (file or directory), name, creation date (on systems that record it), modification date, owner and permissions. With the -exec option, other UNIX commands can be run on each file or directory found.

`$ find [where to start searching from] [expression describing what to find]`

1. Search for a file with a specific name.

`$ find ./folder -name sample.txt`

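2. Search for files matching a pattern.

`$ find ./folder -name "*.txt"`

This lists all files whose names end in ‘.txt’. The pattern is quoted so that the shell passes it to find unchanged instead of expanding it against the current directory first.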
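The pattern search can be tried on a small scratch tree. This is only a sketch: the directory layout and file names below are hypothetical, created for the demonstration and removed afterwards.

```shell
# Scratch tree for the demonstration (hypothetical names, removed at the end).
demo=$(mktemp -d)
mkdir -p "$demo/folder/sub"
touch "$demo/folder/a.txt" "$demo/folder/sub/b.txt" "$demo/folder/notes.md"

# The quotes keep the shell from expanding *.txt, so find receives the
# pattern itself and applies it at every level below ./folder.
matches=$(cd "$demo" && find ./folder -name "*.txt" | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Both a.txt and sub/b.txt are reported, while notes.md is skipped.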

3. Find and delete a file with confirmation.

`$ find ./folder -name sample.txt -exec rm -i {} \;`

4. Search for empty files and directories.

`$ find ./folder -empty`

5. Search for files with the given permissions.

`$ find ./folder -perm 664`

6. Search for text within multiple files.

`$ find ./ -type f -name "*.txt" -exec grep 'Geek' {} \;`
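As a rough illustration of examples 4 to 6, the sketch below sets up a throwaway directory and runs the three searches against it. The file names, contents and permission bits are assumptions chosen for the demonstration.

```shell
# Throwaway directory with hypothetical files; permissions are set
# explicitly so the result does not depend on the current umask.
demo=$(mktemp -d)
mkdir -p "$demo/folder/emptydir"
printf 'Geek on this line\n' > "$demo/folder/has.txt"
printf 'nothing to see\n'    > "$demo/folder/hasnot.txt"
: > "$demo/folder/blank.txt"
chmod 664 "$demo/folder/has.txt"
chmod 600 "$demo/folder/hasnot.txt" "$demo/folder/blank.txt"

# Empty files and empty directories.
empty=$(cd "$demo" && find ./folder -empty | sort)

# Regular files whose permission bits are exactly 664.
perm664=$(cd "$demo" && find ./folder -type f -perm 664)

# Run grep once per .txt file; grep prints only the matching lines.
lines=$(cd "$demo" && find ./ -type f -name "*.txt" -exec grep 'Geek' {} \;)

printf '%s\n' "$empty" "$perm664" "$lines"
rm -rf "$demo"
```

Because grep is started separately for each file here, it does not prefix the matching lines with the file name.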


== grep ==

The grep filter searches a file for a particular pattern of characters and displays all lines that contain that pattern. The pattern is written as a regular expression (the name grep comes from "globally search for a regular expression and print").

Syntax:

`grep [options] pattern [files]`

1. Case-insensitive search : The -i option makes grep ignore case when matching the string in the given file. The command below matches words like “UNIX”, “Unix” and “unix”.

`$ grep -i "UNix" file.txt`

2. Displaying the count of matches : The -c option prints the number of lines that match the given string/pattern. It counts lines, not individual occurrences.

`$ grep -c "unix" file.txt`

3. Displaying the names of files that match the pattern : The -l option prints only the names of the files that contain the given string/pattern.

`$ grep -l "unix" *`

or

`$ grep -l "unix" f1.txt f2.txt f3.txt f4.txt`

4. Checking for whole words in a file : By default, grep matches the given string/pattern even if it is found as a substring of a longer word. The -w option makes grep match only whole words.
 
`$ grep -w "unix" file.txt`

5. Displaying only the matched pattern : By default, grep displays the entire line which has the matched string. The -o option makes grep display only the matched string.
 
`$ grep -o "unix" file.txt`
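The difference between these options is easiest to see on a tiny input. The following sketch assumes a three-line sample file whose contents are made up for the illustration.

```shell
# Hypothetical sample file: "unix" appears twice on line 1, only as
# "Unix" on line 2, and as part of a longer word on line 3.
demo=$(mktemp -d)
printf '%s\n' 'unix is great. unix is free.' \
              'learning Unix' \
              'unixlike os' > "$demo/file.txt"

line_count=$(grep -c "unix" "$demo/file.txt")      # lines containing "unix"
hit_count=$(grep -o "unix" "$demo/file.txt" | wc -l | tr -d ' ')  # every occurrence
word_lines=$(grep -cw "unix" "$demo/file.txt")     # lines with "unix" as a whole word
nocase_lines=$(grep -ci "unix" "$demo/file.txt")   # lines matching in any case

echo "lines=$line_count hits=$hit_count whole-word=$word_lines any-case=$nocase_lines"
rm -rf "$demo"
```

With this input the counts are 2 matching lines, 3 occurrences, 1 whole-word line and 3 case-insensitive lines.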

6. Showing line numbers : The -n option prefixes each matching line with its line number in the file.
 
`$ grep -n "unix" file.txt`

7. Inverting the pattern match : The -v option displays the lines that do not match the specified search string/pattern.
 
`$ grep -v "unix" file.txt`

8. Matching the lines that start with a string : The ^ regular expression pattern specifies the start of a line. This can be used in grep to match the lines which start with the given string or pattern. 
 
`$ grep "^unix" file.txt`

9. Matching the lines that end with a string : The $ regular expression pattern specifies the end of a line. This can be used in grep to match the lines which end with the given string or pattern. 
 
`$ grep "os$" file.txt`
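Examples 6 to 9 can be compared side by side on one small file. This is a minimal sketch; the three sample lines are hypothetical.

```shell
# Hypothetical three-line sample file.
demo=$(mktemp -d)
printf '%s\n' 'unix os' 'linux os' 'unix shell' > "$demo/file.txt"

numbered=$(grep -n "unix" "$demo/file.txt")   # matching lines with line numbers
inverted=$(grep -v "unix" "$demo/file.txt")   # lines that do NOT contain "unix"
starts=$(grep "^unix" "$demo/file.txt")       # lines beginning with "unix"
ends=$(grep "os$" "$demo/file.txt")           # lines ending with "os"

printf '%s\n' "-n:" "$numbered" "-v:" "$inverted" "^unix:" "$starts" "os\$:" "$ends"
rm -rf "$demo"
```

Note that "linux os" is not matched by "unix": the letters are in a different order, so it is the only line left by -v.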

10. Specifying patterns with the -e option : The -e option can be used multiple times; a line is displayed if it matches any of the given patterns.

`$ grep -e "Agarwal" -e "Aggarwal" -e "Agrawal" file.txt`
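A short sketch of the multiple-pattern search, using a made-up list of names. The option must be typed with a plain ASCII hyphen, since grep does not treat a typographic dash as an option prefix.

```shell
# Hypothetical name list; only the last entry matches none of the patterns.
demo=$(mktemp -d)
printf '%s\n' 'R. Agarwal' 'S. Aggarwal' 'T. Agrawal' 'U. Sharma' > "$demo/file.txt"

# A line is printed if it matches at least one of the -e patterns.
any=$(grep -e "Agarwal" -e "Aggarwal" -e "Agrawal" "$demo/file.txt")
printf '%s\n' "$any"

rm -rf "$demo"
```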

11. Searching recursively for a pattern in a directory : The -R option searches for the pattern in all files under the given directory, descending into its subdirectories.

Syntax:

`$ grep -R [pattern] [directory]`
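The recursive search might look like this in practice. The sketch combines -R with -l so that only file names are printed; the directory tree is hypothetical.

```shell
# Hypothetical tree with one match at the top level and one nested match.
demo=$(mktemp -d)
mkdir -p "$demo/docs/deep"
printf 'unix here\n' > "$demo/docs/top.txt"
printf 'also unix\n' > "$demo/docs/deep/nested.txt"
printf 'nothing\n'   > "$demo/docs/other.txt"

# -R descends into docs/deep; -l prints each matching file name once.
found=$(cd "$demo" && grep -R -l "unix" docs | sort)
printf '%s\n' "$found"

rm -rf "$demo"
```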

== sed ==

sed is a stream editor used on Unix systems to edit text quickly and efficiently. It can search, replace, insert and delete lines of a text file without opening the file in a text editor. By default, sed writes the edited text to standard output and leaves the input file unchanged.

- sed is a powerful text stream editor. It can perform insertion, deletion, and search and replace (substitution).
- The sed command supports regular expressions, which allow it to perform complex pattern matching.

1. Replacing or substituting a string : The sed command is mostly used to replace text in a file. The simple sed command below replaces the first occurrence of the word “unix” on each line with “linux”.

`$ sed 's/unix/linux/' test.txt`

2. Replacing the nth occurrence of a pattern in a line : Use the /1, /2 etc. flags to replace the first, second, etc. occurrence of a pattern in a line. The command below replaces the second occurrence of the word “unix” with “linux” in each line.

`$ sed 's/unix/linux/2' test.txt`

3. Replacing all occurrences of the pattern in a line : The substitute flag /g (global replacement) tells sed to replace all occurrences of the string in the line.

`$ sed 's/unix/linux/g' test.txt`

4. Replacing from the nth occurrence to all later occurrences in a line : Combine /1, /2 etc. with /g to replace every match from the nth occurrence onward. The following sed command replaces the third, fourth, fifth… “unix” with “linux” in each line. This combination is supported by GNU sed; other sed implementations may reject it.

`$ sed 's/unix/linux/3g' test.txt`
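The four substitution forms can be compared on a single line of input. This is a sketch only: the input string is made up, and the last command assumes GNU sed.

```shell
# One hypothetical input line containing "unix" four times.
line='unix unix unix unix'

first=$(printf '%s\n' "$line" | sed 's/unix/linux/')     # first occurrence only
second=$(printf '%s\n' "$line" | sed 's/unix/linux/2')   # second occurrence only
all=$(printf '%s\n' "$line" | sed 's/unix/linux/g')      # every occurrence
from3=$(printf '%s\n' "$line" | sed 's/unix/linux/3g')   # third onward (assumes GNU sed)

printf '%s\n' "$first" "$second" "$all" "$from3"
```

The first three commands print "linux unix unix unix", "unix linux unix unix" and "linux linux linux linux". With GNU sed the last one prints "unix unix linux linux".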

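5. Parenthesizing the first character of each word : This sed example wraps the first character of every word in parentheses. The pattern matches an uppercase letter at the start of a word, so only capitalized words are affected; \b (word boundary) is a GNU sed extension. With GNU sed the output is (W)elcome (T)o (T)he (T)est (S)tuff.

`$ echo "Welcome To The Test Stuff" | sed 's/\b\([A-Z]\)/(\1)/g'`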

6. Replacing a string on a specific line number : You can restrict the sed command to replace the string only on a specific line. The example below changes only the third line.

`$ sed '3 s/unix/linux/' test.txt`

7. Duplicating the replaced line with the /p flag : The /p print flag prints each replaced line twice on the terminal. If a line does not contain the search pattern and is not replaced, it is printed only once.

`$ sed 's/unix/linux/p' test.txt`

8. Printing only the replaced lines : Use the -n option along with the /p print flag to display only the replaced lines. Here the -n option suppresses sed's automatic printing of every line, so each replaced line is printed once by the /p flag and all other lines are omitted.

`$ sed -n 's/unix/linux/p' test.txt`

9. Replacing a string on a range of lines : You can specify a range of line numbers to the sed command for replacing a string.

`$ sed '1,3 s/unix/linux/' test.txt`
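Examples 6 to 9 are sketched below on a four-line file. The file contents are hypothetical and chosen so that line 2 never matches.

```shell
# Hypothetical four-line input; line 2 does not contain "unix".
demo=$(mktemp -d)
printf '%s\n' 'unix one' 'plain two' 'unix three' 'unix four' > "$demo/test.txt"

# /p alone: the three replaced lines are printed twice, the other once.
with_p=$(sed 's/unix/linux/p' "$demo/test.txt" | wc -l | tr -d ' ')

# -n plus /p: only the replaced lines, each printed once.
only_changed=$(sed -n 's/unix/linux/p' "$demo/test.txt")

# Address 3: substitute on the third line only.
line3=$(sed '3 s/unix/linux/' "$demo/test.txt")

# Address 1,3: substitute on lines 1 to 3; line 4 is left alone.
range=$(sed '1,3 s/unix/linux/' "$demo/test.txt")

printf '%s\n' "lines with /p: $with_p" "$only_changed" "$line3" "$range"
rm -rf "$demo"
```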

10. Deleting lines from a file : The sed command can also delete lines from its output, again without opening the file in an editor.

Examples:

10.1. Delete a particular line, n

Syntax:

`$ sed 'nd' filename.txt`

Example:

`$ sed '5d' filename.txt`

10.2. Delete the last line

Syntax:

`$ sed '$d' filename.txt`

10.3. Delete the lines in the range x to y

Syntax:

`$ sed 'x,yd' filename.txt`

Example:

`$ sed '3,6d' filename.txt`

10.4. Delete from the nth line to the last line

Syntax:

`$ sed 'n,$d' filename.txt`

Example:

`$ sed '12,$d' filename.txt`

10.5. Delete the lines matching a pattern

Syntax:

`$ sed '/pattern/d' filename.txt`

Example:

`$ sed '/abc/d' filename.txt`
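The deletion forms can be checked against a six-line file. In this sketch the file contents and the helper function show are hypothetical; show simply runs one sed script and joins the surviving lines with spaces for compact display.

```shell
# Hypothetical six-line input; only line 2 contains "abc".
demo=$(mktemp -d)
printf '%s\n' 'l1' 'l2 abc' 'l3' 'l4' 'l5' 'l6' > "$demo/filename.txt"

# Helper (illustrative only): apply a sed script, join output lines with spaces.
show() { sed "$1" "$demo/filename.txt" | paste -s -d ' ' -; }

del_5=$(show '5d')          # drop line 5
del_last=$(show '$d')       # drop the last line
del_range=$(show '3,6d')    # drop lines 3 through 6
del_tail=$(show '4,$d')     # drop line 4 through the end
del_match=$(show '/abc/d')  # drop lines containing "abc"

# sed wrote to standard output only; the file still has all six lines.
still=$(wc -l < "$demo/filename.txt" | tr -d ' ')

printf '%s\n' "$del_5" "$del_last" "$del_range" "$del_tail" "$del_match" "file lines: $still"
rm -rf "$demo"
```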

== awk ==

Awk is a scripting language used for manipulating data and generating reports. An awk program requires no compiling and allows the user to use variables, numeric functions, string functions, and logical operators.

Syntax:

`awk options 'selection_criteria {action}' input-file > output-file`

1. Default behavior of Awk : When no selection criteria are given, the action applies to every line, so the command below prints every line of data from the specified file.

`$ awk '{print}' employee.txt`

2. Print the lines which match the given pattern. 

`$ awk '/manager/ {print}' employee.txt`

3. Splitting a line into fields : For each record, i.e. line, the awk command splits the record on whitespace by default and stores the pieces in the $n variables. If the line has 4 words, they are stored in $1, $2, $3 and $4 respectively. Also, $0 represents the whole line. The command below prints the first and fourth fields of each line.

`$ awk '{print $1,$4}' employee.txt`
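The three awk examples can be run against a small sample file. The contents of employee.txt below are an assumption made for this sketch (name, role, department, salary); the original file is not shown in this section.

```shell
# Hypothetical employee.txt: four whitespace-separated fields per line.
demo=$(mktemp -d)
printf '%s\n' 'ajay manager account 45000' \
              'sunil clerk account 25000' \
              'varun manager sales 50000' > "$demo/employee.txt"

# No pattern: the action runs for every line.
everything=$(awk '{print}' "$demo/employee.txt")

# Pattern /manager/: the action runs only for lines containing "manager".
managers=$(awk '/manager/ {print}' "$demo/employee.txt")

# $1 and $4 are the first and fourth fields; the comma inserts a space.
fields=$(awk '{print $1,$4}' "$demo/employee.txt")

printf '%s\n' "$managers" "$fields"
rm -rf "$demo"
```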