Using Zparseopts

tl;dr: Always use the flags -D and -E (and -F if available). Add a colon to indicate an argument is required (similar to getopt)

The problem:

So… you want to write a fancy program, with its fancy command-line arguments which control its fancy behavior.

You want to define fancy flags which take arguments, less fancy flags which don’t, and complain when you find any invalid flags.

Now you could parse those fancy arguments by hand, but this is a common problem, so surely it is a solved problem, right?

The candidates:

Meet our candidates: The POSIX-specified getopts, util-linux’s getopt, and Zsh’s zparseopts.

Before we get into it, let’s make some requirements:

  1. Arguments may be mixed in with flags
  2. The flag -a/--arg takes an argument, stored in $arg_val.
  3. The presence of flag -f/--flag should be stored in $flag.
  4. The two flags -F/--foo and -B/--bar override each other; the last one given is stored in $foobar.
  5. -- denotes end-of-options.
  6. If an unknown flag is provided before --, print the invalid option and exit.
  7. After parsing, all parsed flags and the -- end-of-options indicator should be removed from "$@".

Parsing by hand

args=()
while (( $# )); do
	case $1 in
	--)
		shift
		args+=("$@")
		break
	;;
	-a|--arg)
		if (( $# > 1)); then
			arg_val="$2"
			shift
		else
			echo >&2 "Missing argument for $1"
			exit 1
		fi
	;;
	-F|--foo) foobar=foo ;;
	-B|--bar) foobar=bar ;;
	-f|--flag) flag=1 ;;
	-*)
		echo >&2 "Invalid option: $1"
		exit 1
	;;
	*) args+=("$1") ;;
	esac
	shift
done
set -- "${args[@]}"

getopts: Basically parsing by hand

Using getopts usually goes like this:

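while getopts a:fFB opt; do
	case $opt in
	a) arg_val="$OPTARG" ;;
	f) flag=1 ;;
	F) foobar=foo ;;
	B) foobar=bar ;;
	'?')
		# getopts has already printed a diagnostic naming the bad option
		exit 1
	;;
	esac
done
# Shift once, after the loop: this drops everything getopts consumed,
# including a terminating -- if there was one
shift $(( OPTIND - 1 ))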

Now, while this looks much more compact (there is less error handling) you may notice the lack of a -- case. This is because getopts will stop at --, the end of the parameters, or any other non-option value, failing requirement 1. To support this, this loop would have to be modified to additionally test what parameter is next after getopts fails. This would make the parsing more complex than doing it by hand.
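
To give a feel for that extra work, here is one possible sketch (certainly not the only way to do it) that keeps getopts but feeds it one word at a time, so operands can be mixed in with flags. The function name parse_mixed and the helper variable need_val are made up for illustration. The leading colon in the option string puts getopts into silent mode so the function reports errors itself. It assumes a shell with arrays, such as Zsh or Bash.

```sh
# Sketch only: parse_mixed and need_val are hypothetical names.
# Sets arg_val, flag, foobar and the array args (the remaining operands).
parse_mixed() {
	local opt need_val OPTIND
	args=()
	while (( $# )); do
		case $1 in
		--)
			# End of options: everything after it is an operand
			shift
			args+=("$@")
			break
		;;
		-?*)
			# Hand getopts a single word so it can never swallow a later --
			OPTIND=1 need_val=
			while getopts :a:fFB opt "$1"; do
				case $opt in
				a) arg_val=$OPTARG ;;  # attached value, as in -aVALUE
				:) need_val=1 ;;       # -a ended the word; its value is the next word
				f) flag=1 ;;
				F) foobar=foo ;;
				B) foobar=bar ;;
				*)
					echo >&2 "Invalid option: -$OPTARG"
					return 1
				;;
				esac
			done
			if [[ -n $need_val ]]; then
				if (( $# < 2 )); then
					echo >&2 "Missing argument for -a"
					return 1
				fi
				arg_val=$2
				shift
			fi
		;;
		*) args+=("$1") ;;
		esac
		shift
	done
}

parse_mixed one -fa val two -B -- -x
echo "arg_val=$arg_val flag=$flag foobar=$foobar args=${args[*]}"
```

That is already longer than the hand-written loop, and it still only understands short options.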

Oh, and did I mention that getopts only supports short (-a) flags and options?

getopt: The powerhouse of parsing?

Using getopt goes something like this:

if ! temp="$(getopt -o a:fFB --long arg:,flag,foo,bar -n "$0" -- "$@")"; then
	# getopt found an unrecognized option
	exit 1
fi
eval set -- "$temp"
while :; do
	case $1 in
	-a|--arg)
		arg_val=$2
		shift
	;;
	-f|--flag) flag=1 ;;
	-F|--foo) foobar=foo ;;
	-B|--bar) foobar=bar ;;
	--)
		shift
		break
	;;
	*)
		echo >&2 "Error parsing arguments"
		exit 1
	;;
	esac
	shift
done

The value of getopt is that it reorders the parameters, so we are guaranteed to see all our options first and all our parameters second, split up in an easy-to-parse way.

getopt also supports GNU-style options. The program above can be called as:

./myprogram param --arg=value -aval

And getopt will print out --arg 'value' -a 'val' -- 'param'.

It is only half of a solution (validation and handling mixed parameters), but it does that half very well. getopt prints all the options, then --, then all the parameters.

This is clearly more powerful than getopts, but because it isn’t built into the shell, it must rely on eval set -- to pass information back and cannot set parameters for us.
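
To see why the eval is needed, here is a small sketch that does not call getopt at all. The string in temp is a hand-written stand-in for the kind of quoted output getopt produces, and the names count, arg_value and operand exist only for this illustration.

```sh
# Stand-in for getopt's output; a real script would use temp="$(getopt ...)"
temp=" --arg 'two words' -f -- 'my file.txt'"

# eval re-parses the quoting, so each quoted chunk becomes exactly one parameter
eval set -- "$temp"

count=$#
arg_value=$2
operand=$5
printf '%s parameters; --arg got "%s"; operand is "%s"\n' "$count" "$arg_value" "$operand"
```

Without the eval, the single quotes would be passed along as literal characters instead of grouping words, which is why the output of getopt has to be re-parsed by the shell.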

zparseopts: Both halves

Using zparseopts looks like this:

# Zsh 5.8 and later
zmodload zsh/zutil
zparseopts -D -E -F - a:=arg_val -arg:=arg_val f=flag -flag=flag \
	F=foobar -foo=foobar B=foobar -bar=foobar || exit 1

# Get index of first end of options indicator
((rmidx=$@[(i)(--|-)]))
set -- "${@[0,rmidx-1]}" "${@[rmidx+1,-1]}"

The -F flag was only added in Zsh 5.8. Without it, more work is needed to exit on invalid flags:

# Before Zsh 5.8
zmodload zsh/zutil
zparseopts -D -E - a:=arg_val -arg:=arg_val f=flag -flag=flag \
	F=foobar -foo=foobar B=foobar -bar=foobar

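if (( $# )); then
	rmidx=$@[(i)(--|-)]
	# Anything before the end-of-options marker that still starts with -
	# was not recognized by zparseopts
	invalid_opts=( ${(M)argv[1,rmidx-1]:#-*} )
	if (( $#invalid_opts )); then
		echo >&2 "Invalid options: $invalid_opts"
		exit 1
	fi
	set -- "${@[0,rmidx-1]}" "${@[rmidx+1,-1]}"
fi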

zparseopts, as a builtin, can (and does) change local parameters. Run either of these with ./myprogram --arg val --flag --foo -B and you will find the following parameters set:

arg_val=(--arg val)
flag=(--flag)
foobar=(-B)

…So, depending on how you prefer your flags stored, you may wish to change some things:

foobar=${${foobar/(-F|--foo)/foo}/(-B|--bar)/bar}
flag=${#flag}
arg_val=${arg_val[-1]}

But you may not need to do this! You could match against (--foo|-F) and (--bar|-B) later in the program. If arg_val needs to be a list of values, as in ./myprogram -a val1 --arg val2, you can support this with zparseopts -D -E - a:+=arg_val -arg:+=arg_val, and then step through them like this:

for flag val in "${(@)arg_val}"; do
	# $flag is '-a' or '--arg'
	do_something_to "$val"
done

Or do two types of arguments with one array:

zparseopts -D -E - a:+=a_and_b b:+=a_and_b

for flag val in "${(@)a_and_b}"; do
	case $flag in
	-a) do_something_a_to "$val" ;;
	*)  do_something_b_to "$val" ;;
	esac
done

The details

Now, you may be wondering about -D and -E.

Well, -D removes all the matched options from the parameter list (supporting requirement 7), and -E tells zparseopts to expect options and parameters to be mixed in (supporting requirement 1; without it, parsing stops at the first non-option, as getopts does).

What I find nice about zparseopts is that semantics like overriding vs stacking flags can be defined in the command, rather than managed after parsing.

Here is a stacking example: -v increases verbosity, and -q decreases it:

zparseopts -D -E - v+=flag_v -verbose+=flag_v q+=flag_q -quiet+=flag_q
(( verbosity = $#flag_v - $#flag_q ))

The verdict

I’ve glossed over some details; there is more to both getopt and zparseopts. zparseopts has a couple more niche flags, and getopt has a flag that lets it support single leading dashes for long options.

That said, zparseopts does most of your work for you, and it doesn’t require your shell to fork out for an external program like getopt. I highly recommend using it.