
Text Transformation and Reporting

Lesson 30/40 | Study Time: 15 Min

Text transformation and reporting are essential tasks in Linux system administration and data processing. Tools like sed, awk, and grep allow administrators to manipulate streams of text, generate structured reports, and perform complex data extraction through pattern matching. Efficient use of these utilities enables processing of single or multiple files seamlessly for automation, analysis, and auditing purposes.

Stream Editing with sed

sed is a stream editor for filtering and transforming text from a pipeline or from files. Common uses include substitution, deletion, insertion, and complex pattern-based editing.


1. Syntax for substitution:

```bash
sed 's/pattern/replacement/g' inputfile
```

The g flag replaces every occurrence on each line; without it, sed replaces only the first match on each line.
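A minimal illustration of the difference (the sample text is illustrative):

```bash
line="red apple, red car"
# Without g: only the first "red" on the line is replaced.
printf '%s\n' "$line" | sed 's/red/blue/'    # blue apple, red car
# With g: every "red" on the line is replaced.
printf '%s\n' "$line" | sed 's/red/blue/g'   # blue apple, blue car
```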


2. Deleting lines matching a pattern:

```bash
sed '/pattern/d' file.txt
```
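For example, deleting matching lines from a sample stream (the log lines are illustrative):

```bash
# Delete every line containing "DEBUG".
printf 'INFO start\nDEBUG verbose detail\nINFO done\n' | sed '/DEBUG/d'
# INFO start
# INFO done
```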


3. Multiple commands with -e:

```bash
sed -e 's/old/new/g' -e '/^$/d' file.txt
```


4. sed supports regular expressions for powerful matching; both GNU and BSD sed accept the -E option for extended regular expressions.
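As a sketch of regex-based editing with -E (the sample text is illustrative):

```bash
# Replace every run of digits with the placeholder N.
printf 'order 1234 shipped on 2024-05-01\n' | sed -E 's/[0-9]+/N/g'
# order N shipped on N-N-N
```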

Report Generation with awk

awk is a domain-specific language designed for text processing and reporting. It splits each input line into fields and executes actions when their associated conditions (patterns) match.


Example: Print the first and third fields of a CSV file:

```bash
awk -F',' '{ print $1, $3 }' file.csv
```
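A quick demonstration with inline sample rows (the data is illustrative):

```bash
# Print the first and third comma-separated fields of each row.
printf 'alice,30,admin\nbob,25,dev\n' | awk -F',' '{ print $1, $3 }'
# alice admin
# bob dev
```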


Summation example:

```bash
awk '{ sum += $2 } END { print sum }' data.txt
```
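With inline sample data (illustrative values), the summation looks like this:

```bash
# Accumulate the second field of every line; print the total at end of input.
printf 'a 10\nb 20\nc 12\n' | awk '{ sum += $2 } END { print sum }'
# 42
```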


It supports built-in variables, conditional statements, loops, and formatted output (printf).
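A sketch of formatted output with awk's printf (the sample data is illustrative):

```bash
# Columnar report: left-align the name in 10 characters and print the
# value right-aligned with two decimal places.
printf 'disk 42.5\nmem 7.25\n' | awk '{ printf "%-10s %8.2f\n", $1, $2 }'
```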

Complex Pattern Matching with grep

grep searches input for lines matching patterns expressed as regular expressions. Useful options include case-insensitive matching (-i), context lines (-A, -B), counting matching lines (-c), and recursive search (-r).


Example:

```bash
grep -i "error" /var/log/*.log
```


Useful for quick extraction and filtering of information across files.
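The counting and context options can be exercised on a small sample file (file contents here are illustrative):

```bash
# Build a small sample log in a temporary file.
f=$(mktemp)
printf 'ok\nerror: disk full\nretrying\nok\n' > "$f"

grep -c 'error' "$f"     # number of matching lines: 1
grep -A 1 'error' "$f"   # each match plus one line of trailing context
# error: disk full
# retrying
rm -f "$f"
```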

Data Extraction and Formatting

Combine tools in pipelines to extract and format data:


Example: Extract users from /etc/passwd and sort:

```bash
awk -F: '{ print $1 }' /etc/passwd | sort
```

Use sed to clean or modify data before reporting.
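A minimal sketch of this pattern with illustrative inline data: sed strips comments and blank lines, then awk formats the report:

```bash
# sed removes comment text and empty lines; awk extracts the value field.
printf '# config\nname=web1\n\nname=db1\n' \
  | sed -e 's/#.*//' -e '/^$/d' \
  | awk -F'=' '{ print "host:", $2 }'
# host: web1
# host: db1
```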

Multi-file Processing

awk and sed can process multiple files by listing them or using wildcards.


Example with awk:

```bash
awk '/pattern/ {print FILENAME ": " $0}' *.log
```


xargs or shell loops combined with these tools enable bulk data transformations.
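As a self-contained sketch of the loop approach (the file names and contents are illustrative), counting "error" lines per file:

```bash
# Create two sample files in a temporary directory.
dir=$(mktemp -d)
printf 'error one\nok\n' > "$dir/a.log"
printf 'ok\nerror two\nerror three\n' > "$dir/b.log"

# Report the number of "error" lines in each file.
for f in "$dir"/*.log; do
  printf '%s: %d\n' "$(basename "$f")" "$(grep -c 'error' "$f")"
done
# a.log: 1
# b.log: 2
rm -rf "$dir"
```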