Text Processing

In Linux, "Everything is a file." Being able to view, search, and manipulate text data is a core skill for log analysis and configuration management.

Viewing Content

  • cat: Print the entire file to the terminal (best for small files).
  • less: View file content screen-by-screen (navigation: space for next page, q to quit).
  • head -n 10: View the first 10 lines of a file.
  • tail -n 10: View the last 10 lines of a file.
  • tail -f: Follow the file in real-time (perfect for watching logs as they happen).
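As a quick sketch, the viewing commands above can be tried on a small throwaway file (sample.txt here is just a placeholder name):

```shell
# Create a small sample file to experiment with
printf 'line %s\n' 1 2 3 4 5 > sample.txt

# First two lines of the file
head -n 2 sample.txt

# Last line of the file
tail -n 1 sample.txt
```

For a growing log you would use `tail -f sample.txt` instead, which keeps the file open and prints new lines as they are appended (Ctrl+C to stop).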

Searching with grep

grep (short for "global regular expression print") is the standard tool for searching text for patterns.

# Search for 'error' in a log file
grep "error" application.log
 
# Case-insensitive search
grep -i "error" application.log
 
# Search recursively in a directory
grep -r "API_KEY" ./src

Data Manipulation

  • wc -l: Count the number of lines in a file.
  • sort: Sort lines alphabetically or numerically (-n).
  • uniq: Filter out adjacent duplicate lines (often used after sort).
  • awk: A powerful language for pattern scanning and processing (e.g., awk '{print $1}' prints the first column of each line).

# Count requests per IP address in an access log
# (uniq -c prefixes each unique line with its occurrence count)
awk '{print $1}' access.log | sort | uniq -c
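The other tools in the list combine the same way. A minimal sketch using a throwaway file (nums.txt is a placeholder name):

```shell
# Sample numeric data, one value per line
printf '3\n1\n2\n2\n' > nums.txt

# Count lines in the file
wc -l < nums.txt

# Numeric sort, then drop adjacent duplicates
# (sort first so duplicates become adjacent for uniq)
sort -n nums.txt | uniq
```

Note that `uniq` alone would miss the duplicate 2s if they were not adjacent, which is why `sort` almost always comes first.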

[!IMPORTANT] Piping (|)
The pipe character (|) takes the output of one command and feeds it as input to the next. This is the secret to Linux "one-liners."
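For example, the grep and wc commands from earlier combine into a single pipeline (application.log is a placeholder file name):

```shell
# Count how many lines in a log mention "error", ignoring case:
# grep filters the matching lines, wc -l counts them
grep -i "error" application.log | wc -l
```

The same result is available as `grep -ci "error" application.log`, but the pipe form generalizes to any pair of commands.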