tl;dr: Always use the flags -D and -E (and -F if available).
Add a colon to indicate an argument is required (similar to getopt).
The problem:
So… you want to write a fancy program, with its fancy command-line arguments which control its fancy behavior.
You want to define fancy flags which take arguments, less fancy flags which don’t, and complain when you find any invalid flags.
Now, you could parse those fancy arguments by hand, but this is a common problem, so surely it is a solved problem, right?
The candidates:
Meet our candidates:
The POSIX-specified getopts,
util-linux’s getopt,
and Zsh’s zparseopts.
Before we get into it, let’s make some requirements:
- Arguments may be mixed in with flags.
- The flag -a/--arg takes an argument, stored in $arg_val.
- The presence of flag -f/--flag should be stored in $flag.
- The two flags -F/--foo and -B/--bar override each other, stored in $foobar.
- -- denotes end-of-options.
- If an unknown flag is provided before --, print the invalid option and exit.
- After parsing, "$@" should have all parsed flags and the -- end-of-options indicator removed.
Parsing by hand
args=()
while (( $# )); do
  case $1 in
    --)
      # End of options: keep everything after it untouched
      shift
      args+=("$@")
      break
      ;;
    -a|--arg)
      if (( $# > 1 )); then
        arg_val="$2"
        shift
      else
        echo >&2 "Missing argument for $1"
        exit 1
      fi
      ;;
    -F|--foo) foobar=foo ;;
    -B|--bar) foobar=bar ;;
    -f|--flag) flag=1 ;;
    -*)
      echo >&2 "Invalid option: $1"
      exit 1
      ;;
    *) args+=("$1") ;;
  esac
  shift
done
set -- "${args[@]}"
getopts: Basically parsing by hand
Using getopts usually goes like this:
while getopts a:fFB opt; do
  case $opt in
    a) arg_val="$OPTARG" ;;
    f) flag=1 ;;
    F) foobar=foo ;;
    B) foobar=bar ;;
    '?')
      echo >&2 "Invalid option: ${@:$OPTIND:1}"
      exit 1
      ;;
  esac
done
# Drop everything getopts consumed, once, after the loop has finished
shift $((OPTIND-1))
[[ $1 = -- ]] && shift
Now, while this looks much more compact (there is less error handling),
you may notice the lack of a -- case.
This is because getopts will stop at --, the end of the parameters,
or any other non-option value,
failing requirement 1.
To support this, this loop would have to be modified
to additionally test what parameter is next after getopts fails.
This would make the parsing more complex than doing it by hand.
Oh, and did I mention that getopts only supports short (-a) flags and options?
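For the curious, here is a rough sketch of what such a modified loop might look like. It assumes bash's getopts, where OPTIND only moves past a word once every flag clustered in it has been handled, and it lives in a function I'm calling parse_mixed (a made-up name) so it can be tried out without touching your script's own parameters. It peeks at each word first, sets plain parameters aside itself, and only lets getopts chew on one option word (plus its argument) at a time:

```shell
# Sketch only: assumes bash. parse_mixed is a hypothetical helper name.
parse_mixed() {
  args=() arg_val='' flag='' foobar=''
  local opt
  while (( $# )); do
    case $1 in
      --) shift; args+=("$@"); break ;;      # end of options: keep the rest
      -?*) ;;                                 # an option word: handled below
      *) args+=("$1"); shift; continue ;;     # a plain parameter: set it aside
    esac
    OPTIND=1                                  # restart getopts on the current $1
    while getopts a:fFB opt; do
      case $opt in
        a) arg_val=$OPTARG ;;
        f) flag=1 ;;
        F) foobar=foo ;;
        B) foobar=bar ;;
        '?') return 1 ;;                      # getopts has already complained
      esac
      (( OPTIND > 1 )) && break               # this word (and its argument) is done
    done
    shift $(( OPTIND - 1 ))                   # drop what getopts consumed
  done
}
```

Calling parse_mixed one -f two -a val -F -B -- -x leaves one, two and -x in args, with flag=1, arg_val=val and foobar=bar. That is about as much code as the hand-written loop, and it still cannot do long options.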
getopt: The powerhouse of parsing?
Using getopt goes something like this:
if ! temp="$(getopt -o a:fFB --long arg:,flag,foo,bar -n "$0" -- "$@")"; then
  # getopt found an unrecognized option
  exit 1
fi
eval set -- "$temp"
while :; do
  case $1 in
    -a|--arg)
      arg_val=$2
      shift
      ;;
    -f|--flag) flag=1 ;;
    -F|--foo) foobar=foo ;;
    -B|--bar) foobar=bar ;;
    --)
      shift
      break
      ;;
    *)
      echo >&2 "Error parsing arguments"
      exit 1
      ;;
  esac
  shift
done
The value of getopt is that it reorders the parameters
so we are guaranteed to see all our options first and all our parameters second,
split up in an easy-to-parse way.
getopt also supports GNU-style options.
The program above can be called as:
./myprogram param --arg=value -aval
And getopt will print out --arg 'value' -a 'val' -- 'param'.
It is only half of a solution (validation and handling mixed parameters), but it does that half very well.
getopt prints all the options, then --, then all the parameters.
This is clearly more powerful than getopts,
but because it isn’t built into the shell,
it must rely on eval set -- to pass information back
and cannot set parameters for us.
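To see the reordering and the eval round trip in action, here is a small sketch. It assumes the enhanced util-linux getopt is the one on your PATH (the getopt -T check is the test its man page describes for that), and the sample arguments and the name myprogram are just placeholders:

```shell
# `getopt -T` exits with status 4 only for the enhanced util-linux version.
getopt -T >/dev/null 2>&1
if [ $? -eq 4 ]; then
  temp=$(getopt -o a:fFB --long arg:,flag,foo,bar -n myprogram -- \
    'my file' --arg='two words' -f)
  printf '%s\n' "$temp"   # one string, quoted so the shell can re-read it
  eval set -- "$temp"     # re-split that string into positional parameters
  printf '<%s>\n' "$@"    # one parameter per line, spaces intact
fi
```

With util-linux getopt the first line printed is --arg 'two words' -f -- 'my file' (with a leading space), followed by the five re-split parameters. The eval is what turns those quotes back into word boundaries.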
zparseopts: Both halves
Using zparseopts looks like this:
# Zsh 5.8 and later
zmodload zsh/zutil
zparseopts -D -E -F - a:=arg_val -arg:=arg_val f=flag -flag=flag \
F=foobar -foo=foobar B=foobar -bar=foobar || exit 1
# Get index of first end of options indicator
((rmidx=$@[(i)(--|-)]))
set -- "${@[0,rmidx-1]}" "${@[rmidx+1,-1]}"
The -F flag was only added in Zsh 5.8;
without it, more work needs to be done to exit on invalid flags:
# Before Zsh 5.8
zmodload zsh/zutil
zparseopts -D -E - a:=arg_val -arg:=arg_val f=flag -flag=flag \
  F=foobar -foo=foobar B=foobar -bar=foobar
if (( $# )); then
  rmidx=$@[(i)(--|-)]
  if [[ -n ${invalid_opt::=${(M)@[0,rmidx-1]#-}} ]]; then
    echo >&2 "Invalid options: $invalid_opt"
    exit 1
  fi
  set -- "${@[0,rmidx-1]}" "${@[rmidx+1,-1]}"
fi
zparseopts, as a builtin, can (and does) change local parameters.
Run either of these with ./myprogram --arg val --flag --foo -B
and you will find the following parameters set:
arg_val=(--arg val)
flag=(--flag)
foobar=(-B)
…So, depending on how you prefer your flags stored, you may wish to change some things:
foobar=${${foobar/(-F|--foo)/foo}/(-B|--bar)/bar}
flag=${#flag}
arg_val=${arg_val[-1]}
But you may not need to do this!
You could match against (--foo|-F) and (--bar|-B) later in the program.
If arg_val needs to be a list of values, a la ./myprogram -a val1 --arg val2,
you can support this with zparseopts - a:+=arg_val -arg:+=arg_val,
and then step through them like this:
for flag val in "${(@)arg_val}"; do
  # $flag is '-a' or '--arg'
  do_something_to "$val"
done
Or do two types of arguments with one array:
zparseopts -D -E - a:+=a_and_b b:+=a_and_b
for flag val in "${(@)a_and_b}"; do
  case $flag in
    -a) do_something_a_to "$val" ;;
    *) do_something_b_to "$val" ;;
  esac
done
The details
Now, you may be wondering about -D and -E.
Well, -D removes all the matched options from the parameter list
(supporting requirement 7),
and -E tells zparseopts to expect options and parameters to be mixed in
(supporting requirement 1; without it, it will stop like getopts does).
What I find nice about zparseopts is that semantics
like overriding vs stacking flags can be defined in the command,
rather than managed after parsing.
Here is a stacking example:
-v increases verbosity, and -q decreases it:
zparseopts -D -E - v+=flag_v -verbose+=flag_v q+=flag_q -quiet+=flag_q
(( verbosity = $#flag_v - $#flag_q ))
The verdict
I’ve glossed over some details;
there is more to both getopt and zparseopts.
zparseopts has a couple more niche flags,
and getopt has a flag that lets it support single leading dashes for long options.
That said, zparseopts does most of your work for you,
and it doesn’t require your shell to fork out for an external program like getopt.
I highly recommend using it.