CommandlineParsing
When processing command line arguments, applications commonly perform the following basic conceptual steps:
- Parse the arguments into an analyzable form, including identifying options/switches and their values.
- Bind the results of the parse to a data structure that the application then uses to adapt its behavior based on the command line arguments.
The command line parsing library splits argument processing into these two basic steps. Furthermore, the actual parsing is isolated behind an interface so that multiple implementations are possible. The first implementation uses the Sprache Parser Combinator Library, a lightweight (<50KB) parsing library useful for many applications. Other parsers are possible if an application already needs a more traditional parser engine, but given the small size of the Sprache library, replacing it isn't likely to save much.
The parsing stage produces an ordered list of arguments, each of which is either a switch/option with an optional value or a general positional argument that may be quoted to include whitespace. For some applications the list is all that is needed. Others want an object model, and this is where binding comes in. The binding stage takes the list of arguments and, with the aid of .NET attributes, binds the values from the command line to properties of an object instance. For greater flexibility, the actual property binding is isolated behind an interface so that multiple implementations are possible. The library includes a standard .NET Reflection based property provider, which is likely to serve for most uses. It is also plausible to implement a compile time code generator that creates a property provider that needs no reflection, making it suitable for .NET Native AOT compilation scenarios where minimizing reflection is desired.
Key Features
- Flexible parsing
- Flexible binding
Flexible Parsing
The CommandlineParsing library takes a generalized approach to parsing command line arguments. While there is a large variety of styles for providing arguments, they all tend to share some things in common. Generally speaking, the grammar of command lines looks something like this:
PositionalArg = QuotedValue | UnquotedValue;
Option = SWITCH, Identifier [DELIM, (QuotedValue | UnquotedValue ) ];
args = {PositionalArg | Option };
One challenge with any general library for command line arguments is in determining what to allow for SWITCH and DELIM. Generally, SWITCH is either -, -- or /, though for some applications multiple forms are allowed and each may have distinct meanings or even namespaces for the options. Furthermore, many command line styles use different delimiters for options that accept a value. Common delimiters are : and =, though some applications use whitespace as a delimiter, which poses a real challenge as the parser doesn't know which options even allow a value.
To support all of these variations, a generalized parsing library requires some level of abstraction. The CommandlineParsing library provides this by using an approach that can parse all of the variations and leaves correctness to the application or binder. That is, the parser accepts all valid command lines as well as some that are not valid for a given application. This helps keep the application logic simpler and allows for defining common semantic processing and object binding for specific scenarios.
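The grammar above can be illustrated with a small hand-rolled classifier. This is only a sketch under stated assumptions, not the library's Sprache-based parser: the names SketchArg and SketchParser are hypothetical, SWITCH is assumed to be one of -, -- or /, DELIM is assumed to be : or =, and an identifier is assumed to start with a letter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type for this sketch; it is NOT the library's ICommandlineArgument.
public sealed class SketchArg
{
    public SketchArg( string switchText, string name, string value )
    {
        Switch = switchText;
        Name = name;
        Value = value;
    }

    public string Switch { get; } // "-", "--" or "/"; null for a positional argument
    public string Name { get; }   // option identifier; null for a positional argument
    public string Value { get; }  // option value or positional text; null if an option has no value
    public bool IsOption => Switch != null;
}

public static class SketchParser
{
    // Assumed switch and delimiter sets; the longest switch is tested first.
    private static readonly string[] Switches = { "--", "-", "/" };
    private static readonly char[] Delimiters = { ':', '=' };

    // Classifies each argument as an Option or a PositionalArg.
    public static List<SketchArg> Parse( IEnumerable<string> args )
    {
        var result = new List<SketchArg>( );
        foreach( string arg in args )
        {
            // An option is a switch followed by an identifier that starts with a letter.
            string sw = Array.Find( Switches, s => arg.Length > s.Length
                                                && arg.StartsWith( s, StringComparison.Ordinal )
                                                && char.IsLetter( arg[ s.Length ] ) );
            if( sw == null )
            {
                result.Add( new SketchArg( null, null, Unquote( arg ) ) );
                continue;
            }

            string body = arg.Substring( sw.Length );
            int delim = body.IndexOfAny( Delimiters );
            string name = delim < 0 ? body : body.Substring( 0, delim );
            string value = delim < 0 ? null : Unquote( body.Substring( delim + 1 ) );
            result.Add( new SketchArg( sw, name, value ) );
        }

        return result;
    }

    // Removes one pair of surrounding double quotes, if present.
    private static string Unquote( string text )
    {
        bool quoted = text.Length >= 2 && text[ 0 ] == '"' && text[ text.Length - 1 ] == '"';
        return quoted ? text.Substring( 1, text.Length - 2 ) : text;
    }
}
```

Like the approach described above, this sketch accepts more than any one application would. For example, it reports a Unix style path such as /usr/bin as an option named usr/bin, and it never decides whether a value following a bare option belongs to that option. Both decisions are left to the application or binder.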
Flexible Binding
Parsing arguments produces an immutable list of ICommandlineArgument, which is a common interface for both CommandlineOption and CommandlineValue. The application can work with that list directly or use a binder to bind the parsed arguments to properties of an object instance. The CommandlineBinder class provides the common logic for walking the list of arguments and binding them to the properties of an object instance. The actual binding of a value to a property is performed by an implementation of IOptionProperty. Instances of IOptionProperty are supplied by an implementation of another interface, IOptionPropertyProvider, which, given a CommandlineOption, looks up the matching property on the target object and returns the IOptionProperty implementation that performs the binding.
Note
The design of the IOptionPropertyProvider and IOptionProperty interfaces intentionally does not require the use of reflection, though it is allowed. In fact, the default provider is ReflectionOptionPropertyProvider, which covers the large majority of cases. However, it isn't the only possible implementation. It is plausible to use some form of compile time reflection/AOP weaver to generate an implementation of IOptionPropertyProvider for a given options class that does not require any run-time reflection.
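To make the provider idea concrete, the following sketch shows roughly what a reflection-free provider could look like. Every name in it (SketchOptions, ISketchPropertyProvider, SketchOptionsProvider, SketchBinder) is hypothetical, and the interface shape is an assumption made for illustration. The library's actual IOptionProperty and IOptionPropertyProvider signatures may well differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical options class for this sketch.
public sealed class SketchOptions
{
    public string Output { get; set; }
    public bool Verbose { get; set; }
    public List<string> Includes { get; } = new List<string>( );
    public List<string> PositionalArgs { get; } = new List<string>( );
}

// Hypothetical stand-in for the role a property provider plays: it maps an option
// name to an action that binds the value. This is not the library's interface.
public interface ISketchPropertyProvider<T>
{
    // Returns null when no property matches the option name.
    Action<T, string> TryGetBinder( string optionName );
}

// Hand-written provider: a dictionary of delegates, so no run-time reflection is used.
// A compile time generator could plausibly emit a class of this shape per options type.
public sealed class SketchOptionsProvider
    : ISketchPropertyProvider<SketchOptions>
{
    private static readonly Dictionary<string, Action<SketchOptions, string>> Binders
        = new Dictionary<string, Action<SketchOptions, string>>( StringComparer.OrdinalIgnoreCase )
        {
            [ "o" ] = ( target, value ) => target.Output = value,
            [ "v" ] = ( target, value ) => target.Verbose = value == null || bool.Parse( value ),
            [ "i" ] = ( target, value ) => target.Includes.Add( value ),
        };

    public Action<SketchOptions, string> TryGetBinder( string optionName )
    {
        return Binders.TryGetValue( optionName, out var binder ) ? binder : null;
    }
}

public static class SketchBinder
{
    // Walks (Name, Value) pairs in order; a null Name marks a positional argument.
    public static T Bind<T>( T target
                           , ISketchPropertyProvider<T> provider
                           , Action<T, string> bindPositional
                           , IEnumerable<(string Name, string Value)> args
                           )
    {
        foreach( var (name, value) in args )
        {
            if( name == null )
            {
                bindPositional( target, value );
                continue;
            }

            Action<T, string> binder = provider.TryGetBinder( name );
            if( binder == null )
            {
                throw new ArgumentException( $"Unknown option '{name}'" );
            }

            binder( target, value );
        }

        return target;
    }
}
```

The walking logic in SketchBinder knows nothing about the options type, and the provider knows nothing about argument order. That separation is what would let a reflection based provider and a generated one be swapped without touching the binder.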
Example
The following example comes from the unit tests for the CommandlineParsing library and shows many of the capabilities for handling complex options parsing and binding to an options class the application can use.
// Copyright (c) Ubiquity.NET Contributors. All rights reserved.
// Licensed under the MIT license. See the LICENSE.md file in the project root for full license information.

#pragma warning disable SA1649, SA1402, SA1202, SA1652

using System.Collections.Generic;
using System.ComponentModel;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace Ubiquity.CommandlineParsing.Monad.UT
{
    [DefaultProperty( "PositionalArgs" )]
    internal class TestOptions
    {
        public enum Option3Values
        {
            Foo,
            Bar,
            Baz
        }

        public List<string> PositionalArgs { get; } = new List<string>( );

        [CommandlineArg( "o1", AllowSpaceDelimitedValue = true )]
        public string Option1 { get; set; }

        public string Option2 { get; set; }

        // standard converter for enums should not need to be explicitly declared so test that.
        // [TypeConverter(typeof(EnumConverter))]
        public Option3Values Option3 { get; set; }

        [CommandlineArg( "o4" )]
        public bool Option4 { get; set; }

        [CommandlineArg( "m" )]
        public IList<string> MultiOption { get; } = new List<string>( );
    }

    [TestClass]
    public class BinderTests
    {
        // This includes all forms of option switches quoting and the special problematic case of a trailing \ in a quoted string
        // The trailing \ in a quoted string is a notorious hidden gotcha for .NET apps as the default .NET arg parsing generates
        // a quote character as it implements character escaping, unlike any other runtime.
        private readonly string[] FullCommandLine =
        {
            @"positionalarg0",
            @"-m:""Multi 1""",
            @"--o1",
            @"""space delimited value1""",
            @"-MultiOption=""Multi 2""",
            @"positional1",
            @"-Option2=""this is a test""",
            @"/option3:baz",
            @"-m:multi3",
            @"-o4",
            @"positional2",
            @"""positional 3\"""
        };

        [TestMethod]
        public void CommandLineBinderTest()
        {
            var parser = new Parser( );
            TestOptions options = new TestOptions( ).BindArguments( parser.Parse( FullCommandLine ) );
            Assert.IsNotNull( options );
            Assert.AreEqual( 4, options.PositionalArgs.Count );
            Assert.AreEqual( 3, options.MultiOption.Count );
            Assert.AreEqual( "space delimited value1", options.Option1 );
            Assert.AreEqual( "this is a test", options.Option2 );
            Assert.AreEqual( TestOptions.Option3Values.Baz, options.Option3 );
            Assert.IsTrue( options.Option4 );
        }
    }
}
Note
This example treats -, -- and / as equivalent, though other behavior is possible by providing an instance of ReflectionOptionPropertyProvider, or some other implementation of IOptionPropertyProvider, to the binder that will select the appropriate property or reject the parsed arguments as appropriate.
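As an illustration of switch-sensitive behavior, the sketch below shows one possible selection rule that a custom provider could apply. It is an assumption for illustration, not the library's behavior: the SwitchAwareLookup class is hypothetical, - is assumed to accept only the short names, -- only the full property names, and / is rejected. The property names are taken from the TestOptions class in the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; a custom property provider could apply a rule like this
// when deciding which property, if any, a parsed option refers to.
public static class SwitchAwareLookup
{
    // Short names, accepted only after a single dash in this sketch.
    private static readonly Dictionary<string, string> ShortNames
        = new Dictionary<string, string>( StringComparer.OrdinalIgnoreCase )
        {
            [ "o1" ] = "Option1",
            [ "o4" ] = "Option4",
            [ "m" ] = "MultiOption",
        };

    // Full property names, accepted only after a double dash in this sketch.
    private static readonly Dictionary<string, string> LongNames
        = new Dictionary<string, string>( StringComparer.OrdinalIgnoreCase )
        {
            [ "Option1" ] = "Option1",
            [ "Option2" ] = "Option2",
            [ "Option3" ] = "Option3",
            [ "Option4" ] = "Option4",
            [ "MultiOption" ] = "MultiOption",
        };

    // Returns the name of the property to bind, or null to reject the argument.
    public static string Resolve( string switchText, string optionName )
    {
        switch( switchText )
        {
        case "-":
            return ShortNames.TryGetValue( optionName, out string shortMatch ) ? shortMatch : null;

        case "--":
            return LongNames.TryGetValue( optionName, out string longMatch ) ? longMatch : null;

        default:
            // "/" and any other switch form are rejected by this sketch.
            return null;
        }
    }
}
```

Under this rule the example command line would no longer bind as written, since it mixes the forms freely (for instance --o1 and /option3:baz). That is the kind of application-specific policy the parser deliberately leaves to the binder.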