Xman Blog Inside: Gawk Quick Reference

By default, a record is a line, and each line is split into fields on the default delimiter (runs of spaces and tabs). Program syntax is largely modeled on C.
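
As a quick illustration of records and fields, here is a minimal sketch, assuming a POSIX shell with awk on the PATH (gawk should behave the same for this example). The helper name show_fields and the sample input are made up for illustration.

```shell
# show_fields is a hypothetical helper, not part of awk.
# Each input line is one record; runs of whitespace split it into fields.
show_fields() {
  # NR = record number, NF = number of fields, $NF = last field
  awk '{ print NR ": " NF " fields, last = " $NF }'
}

printf 'alpha beta gamma\none  two\n' | show_fields
```

Note that the two spaces between "one" and "two" still count as a single delimiter under the default field separator.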

BEGIN { FS="\t+" }	Initialization. Set the field separator to one or more TABs. The string is treated as a regular expression if it is longer than one character.
END { ... }	Finalization. Execute ... after all files have been processed.
gawk -F"\t" '{ ... }'	Execute with field separator TAB.
/xyz/ { print }	For each line containing xyz, print the line.
$1 == 100 { print $2, $3 }	For each line whose first field equals 100, print the 2nd and 3rd fields separated by OFS (a space by default).
$3 ~ /PAT/ { print $2 $3 }	If the 3rd field matches PAT, print the 2nd and 3rd fields concatenated with nothing between them.
$3 !~ /PAT/ { x = 0 }	If the 3rd field does not match PAT, set x to 0.
print $1 OFS $2	Print 1st and 2nd fields separated by output field separator.
FS, OFS, RS, ORS	Input and output field separators, input and output record separators.
NF, NR	Number of fields in the current record; number of records read so far.
FILENAME	Current input file.
$NF	Last field
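'${val:-hello}'	In a pattern, inside a single-quoted program: the quotes close and reopen so that the shell (BASH) substitutes the value of its variable val, or "hello" if val is unset or empty.
"'${val}'"	In an action, inside a single-quoted program: the shell (BASH) substitutes the value of val, and the surrounding double quotes make it an awk string.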
array[2]="hello"
array["i"]="world"
for(i in array) print array[i]	Loop over every subscript of array, in no guaranteed order. Also works for multi-dimensional arrays, where i is the combined subscript.
if("i" in array) print "found"	Print if "i" is a subscript of array.
n = split($1, array, ":"); for(i = 1; i <= n; i++) print array[i]	Split the 1st field into array with ":" as the delimiter; n is the number of pieces.
array[2,5]="val25"	Equivalent to array["2" SUBSEP "5"]="val25". SUBSEP is the subscript-component separator, "\034" by default.
if((i,j) in array)
ARGC, ARGV, ENVIRON	Number of arguments, argument array, and environment array.
index(s,t)	Position of the first occurrence of t in s, or 0 if t does not occur.
length(s)
sub(r,s[,t])	Replace the first match of r in t with s. t is $0 by default.
match(s,p)	Return the starting position of the substring of s that regular expression p matches, or 0 if there is no match. Also sets RSTART and RLENGTH.
sprintf("fmt",expr)
substr(s,p,n)	The substring of s starting at position p (counting from 1), at most n characters long.
toupper(s), tolower(s)
function name(list) { statement }
getline	Read the next input record into $0.
getline <"-"	Get a line from stdin.
print > "out.txt"	Print to the file out.txt.
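
Several of the entries above (BEGIN, a pattern with an action, an associative array, and END) are commonly combined in a single program. The following is only a sketch under assumptions: the helper name sum_by_key and the two-column tab-separated input are invented for illustration, and plain awk is assumed to be available (gawk should give the same result).

```shell
# sum_by_key is a hypothetical helper: it totals the numeric 2nd field per 1st field.
sum_by_key() {
  awk '
    BEGIN { FS = "\t"; OFS = "\t" }        # tab-separated input and output
    $2 ~ /^[0-9]+$/ { total[$1] += $2 }    # skip rows whose 2nd field is not a number
    END { for (k in total) print k, total[k] }
  ' | sort                                 # for-in order is unspecified, so sort the result
}

printf 'a\t1\nb\t2\na\t3\nb\tx\n' | sum_by_key
```

The trailing sort is there because the order in which for(k in total) visits subscripts is not guaranteed.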

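The string functions listed above can be tried on a single line of input. This is a rough sketch, assuming awk is on the PATH; the helper name parse_stamp and the sample text are hypothetical.

```shell
# parse_stamp is a hypothetical helper that exercises the string functions on one record.
parse_stamp() {
  awk '{
    n = split($1, part, ":")       # break the 1st field on ":" into part[1..n]
    print n, part[1], part[n]
    print index($1, ":")           # position of the first ":" in the 1st field
    print substr($1, 4, 2)         # 2 characters starting at position 4
    if (match($1, /[0-9]+$/))      # trailing digits; sets RSTART and RLENGTH
      print RSTART, RLENGTH
    sub(/:/, "h")                  # replace only the first ":" in $0
    print toupper($0)
  }'
}

echo '12:34:56 utc' | parse_stamp
```

For the sample line this should print 3 12 56, then 3, then 34, then 7 2, and finally 12H34:56 UTC, one per line.
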
Reference: UNIX Power Tools: sed & awk by Dale Dougherty

Wednesday, September 06, 2006
