
Clavis Rebooted: Secure, Type-Safe URLs for ASP.NET

A few years ago, I wrote about a web security microframework for ASP.NET which provided a few primitives for secure parameter-passing and navigation. I've just released a public alpha on NuGet for anyone who's willing to try it.

The previous article covered the theoretical foundation of Clavis well enough, but it has undergone a few small revisions to make it easier to use and integrate more seamlessly with ASP.NET. This post will serve as an end-user introduction to Clavis, the rationale behind the design decisions, and the benefits it provides. As a brief summary to whet your appetite, here are the advantages that the Clavis library provides for an otherwise standard ASP.NET Web Forms or MVC project:

  • By default, URLs are derived from types, so the compiler ensures that every page that will be displayed actually exists. The default URL generated can be overridden via an attribute.
  • Declarative specification of the types and number of parameters a page accepts, which the compiler checks for you. Any type can appear in this specification.
  • Query string parameter names are managed for you, so you rarely need to mess about with parameter strings. In fact, parameter strings never appear anywhere unless you want to override the default parameter name.
  • Taint-checking by declarative specification of unprotected page parameters, which the compiler ensures you handle properly.
  • Incremental integration into existing projects, so adopting Clavis is not an all-or-nothing proposition.

If any of this interests you, please read on.

Overview

Query Parameters: The Problems

The web provides a few mechanisms to pass data to servers. The most commonly used are probably query parameters, like http://google.com/search?q=http+query+string. This is a fairly straightforward means of providing the server with named parameters carrying data. You can view an HTTP request as a function call that returns some content, and the query parameters as the named arguments for that function.

This mental model actually works pretty well. An HTTP server generates content with embedded function calls that the client's browser invokes to access more content. Unfortunately, the limitations of query parameters become immediately obvious. Because function invocations are transparent, malicious clients can easily alter server-specified parameter values. Sometimes this behaviour is allowed, as in the previous Google query example, but there are many scenarios where we don't want clients to be able to make such changes. For instance, we often want to pass data layer identifiers for objects as query parameters, but because these identifiers are easily guessable and clients can change them, clients can easily obtain access to data they shouldn't.

Unfortunately, there's no simple means of preventing clients from changing such parameters, which means raw query parameters can't be used to carry trusted content. Cookies and sessions were then invented to address some of the limitations of query parameters, but they carry their own set of problems.

Implicit Sessions: The Problems

Fundamentally, most site problems with scaling, usability and security can be traced back to the implicit sessions that ASP.NET creates for you.

The security problems are well known: the session encourages a style of server-side parameter passing that immediately opens you up to cross-site request forgery attacks (CSRF). This is typically addressed by adding a new security mechanism to close the hole, but that is arguably the wrong approach, and one that is endemic to some security practices. Instead, don't "add security", remove insecurity.

The scaling problem is simply due to the fact that the session itself can't be part of the presentation layer since it's stateful, but it also doesn't live in the data layer and so doesn't benefit from the robustness and replication of that layer either. In order to scale sessions, they require all the same mechanisms of a data layer which seemingly defeats the purpose of separating it from the data layer to begin with. However, session state is typically not placed in the data layer because it's not part of the domain model. Session state logically belongs to the navigation/interaction component of the presentation layer, so most developers don't want to "pollute" their domain model.

The usability problems are simple: sessions are often stored in cookies which are shared across different browser instances. It's thus far too easy to introduce surprising behaviour for users that view multiple pages of your site at once. The more state you store in the session, the more likely this will occur. For instance, it seems pretty common to store an object instance that's being "edited" in the session, and I was guilty of this too when I first started using ASP.NET because it's so convenient. The immediate implication is that a user can't edit two instances of the same class type at the same time in the same browser. Furthermore, a server upgrade or restart would reset these sessions so that users lose all of their changes.

This is often addressed by out-of-process sessions, but this introduces a whole new set of problems due to serialization. For instance, if the object being edited is one that was upgraded in an incompatible way, deserialization won't succeed so the user can't proceed where they left off. Also, the object could contain unserializable state. For instance, NHibernate lazy references and collections contain an embedded reference to a particular SQL connection. This connection can't be sent out of process, and whatever object instance is deserialized on a subsequent request can't be easily connected to the new SQL connection.

The same serialization and state lifecycle problems occur for sessions stored in an SQL database, although at least these benefit from data layer replication.

The Solution

While the prospects may seem rather bleak at this point, we need only revisit the options to see if there's a simpler alternative that addresses the requirements. For reasons explained in my previous post, the right way to send this sort of data to the server is via query parameters, and not via cookies and sessions, which introduce far too many complications of their own. We only need some standard mechanism to ensure that certain query parameter values can't be changed.

In fact, a simple mechanism for preventing message tampering is already known: the HMAC. So basically, we just need to include an HMAC of the protected query parameters in every URL, and the server can easily ensure that those query parameters are unchanged on all subsequent requests.

The URL must have some standard form so that this checking can be automated by the framework, and in Clavis this URL takes the form of an additional query parameter named "clavis". A site URL that previously looked like:

http://host.com/foo?param1=bar

where 'param1' should be protected from tampering, will now look something like:

http://host.com/foo?-param1=bar&clavis=ab52H_6jKVeyH4h8

You can see above that Clavis prefixes the query parameter with a '-' character in order to distinguish the protected from unprotected parameters. Clavis currently utilizes RIPEMD-160, with the upper 80 bits folded into the lower 80 bits to generate the "clavis" parameter. The 80 bits are encoded using a URL-safe base64 encoding. That makes the clavis parameter 16 characters, which is sufficiently short to be human-readable, but sufficiently unguessable for most purposes. Future updates may make the choice of HMAC and parameter length configurable.
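To make the folding and encoding steps concrete, here is a minimal sketch of how such a token could be computed and checked. It is not the Clavis implementation: the class and method names are hypothetical, XOR is assumed as the folding operation, and HMACSHA1 is swapped in for RIPEMD-160 only because it also produces 160 bits and ships with current .NET.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ClavisTokenSketch
{
    // Fold a MAC in half by XOR-ing the upper half into the lower half
    // (160 bits -> 80 bits). XOR is an assumption; the post only says "folded".
    public static byte[] Fold(byte[] mac)
    {
        int half = mac.Length / 2;
        var folded = new byte[half];
        for (int i = 0; i < half; i++)
            folded[i] = (byte)(mac[i] ^ mac[i + half]);
        return folded;
    }

    // URL-safe base64: '+' becomes '-', '/' becomes '_', padding is dropped.
    public static string ToUrlSafeBase64(byte[] bytes) =>
        Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    // MAC the protected part of the query string with a server-side key.
    // HMACSHA1 is a stand-in here, not the algorithm Clavis uses.
    public static string Sign(byte[] key, string protectedQuery)
    {
        using var hmac = new HMACSHA1(key);
        byte[] mac = hmac.ComputeHash(Encoding.UTF8.GetBytes(protectedQuery));
        return ToUrlSafeBase64(Fold(mac));
    }

    // Recompute the token and compare in constant time.
    public static bool Verify(byte[] key, string protectedQuery, string token) =>
        CryptographicOperations.FixedTimeEquals(
            Encoding.ASCII.GetBytes(Sign(key, protectedQuery)),
            Encoding.ASCII.GetBytes(token));
}

public static class Program
{
    public static void Main()
    {
        byte[] key = Encoding.UTF8.GetBytes("hypothetical server-side secret");
        string token = ClavisTokenSketch.Sign(key, "-param1=bar");
        Console.WriteLine($"http://host.com/foo?-param1=bar&clavis={token}");
        Console.WriteLine(ClavisTokenSketch.Verify(key, "-param1=bar", token)); // True
        Console.WriteLine(ClavisTokenSketch.Verify(key, "-param1=baz", token)); // False: parameter was altered
    }
}
```

Ten bytes encode to 14 base64 characters once padding is dropped, or 16 with the two '=' padding characters, so the exact encoding Clavis uses may differ in such details from this sketch.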

A web app using Clavis need only call Continuation.Validate() prior to processing the request, and if no exception is thrown then the protected parameters were unchanged. As long as all application state between pages is passed via protected query parameters, this also makes your app automatically immune to CSRF.

In reality, your app will be immune to CSRF as long as state that identifies the user is protected in this manner. Even if attackers can guess the user identifier, they can't generate a valid URL because they don't know the server-side private key used in the HMAC, and so CSRF is impossible.

Data Types

Logically, there are 3 types of values passed as query parameters: basic data values (string, int, etc.), basic data values that are proxies for server-side objects (like database keys), and lists of either of the previous two types. These are precisely the basic data types provided by Clavis. Basic data values are simply the primitive CLR types that implement IConvertible. For instance, here is a page accepting an int, a string and a decimal:

public class Foo : IContinuation<int, string, decimal>
{
  ...
}

If you don't know what a continuation is, don't worry about the specifics. You just need to understand that a page specification is declared via IContinuation<...> with the parameter types filled in. By default, all parameter types specified in an IContinuation<...> declaration are assumed to be protected. If we want to allow the client to change the decimal value, then we simply wrap it in Unsafe<T> like so:

public class Foo : IContinuation<int, string, Unsafe<decimal>>
{
  ...
}

The Unsafe<T> wrapper declares the value to be unprotected, and so it will not be included in the HMAC. Continuations containing no protected parameters, i.e. all unsafe parameters or no parameters, will not generate a "clavis" HMAC parameter, so you can always generate semantically meaningful URLs when you really need them. This also makes it easy to support forms submitted via the HTTP GET method.

Suppose we want Foo to accept a list of strings. The declaration will now look like this:

public class Foo : IContinuation<int, IEnumerable<string>, Unsafe<decimal>>
{
  ...
}

List parameters generate multiple entries in the query string, as is the standard approach with URLs. For instance, one possible URL for the above Foo might be:

~/Foo?-Int32=0&-String=foo&-String=bar&-String=hello+world&Decimal=3.14
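As a rough illustration of how a list expands into repeated query entries, the following sketch builds the URL above by hand. The helper names are hypothetical and this is not how Clavis is invoked; it only mirrors the naming and '-' prefix conventions described in this post, and it leaves out the "clavis" token just as the example URL does.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Net;

public static class QueryStringSketch
{
    // Hypothetical helper: emit one "name=value" pair per value, prefixing
    // the name with '-' when the parameter is protected.
    public static IEnumerable<string> Pairs(string name, bool isProtected, params IConvertible[] values) =>
        values.Select(v =>
            (isProtected ? "-" : "") + name + "=" +
            WebUtility.UrlEncode(v.ToString(CultureInfo.InvariantCulture)));

    // Join all pairs, in order, onto the page path.
    public static string Build(string path, params IEnumerable<string>[] parameters) =>
        path + "?" + string.Join("&", parameters.SelectMany(p => p));
}

public static class Program
{
    public static void Main()
    {
        string url = QueryStringSketch.Build("~/Foo",
            QueryStringSketch.Pairs("Int32", true, 0),
            QueryStringSketch.Pairs("String", true, "foo", "bar", "hello world"),
            QueryStringSketch.Pairs("Decimal", false, 3.14m));
        Console.WriteLine(url);
        // ~/Foo?-Int32=0&-String=foo&-String=bar&-String=hello+world&Decimal=3.14
    }
}
```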

For an explanation of how the parameter names are generated, see the section below on URL generation. You can also nest list and unsafe declarations, so an unsafe list of string parameters will look like:

public class Foo : IContinuation<int, Unsafe<IEnumerable<string>>, Unsafe<decimal>>
{
  ...
}

Finally, Foo can also implement multiple continuation types:


public class Foo : IContinuation<int, Unsafe<IEnumerable<string>>, Unsafe<decimal>>
                 , IContinuation<DateTime, Unsafe<char>>
{
  ...
}

Note that the above continuation parameters can specify any type, since this declaration is supposed to be a logical specification of the parameters Foo accepts. For instance, here's a Foo that accepts InventoryItem and Customer objects:


public class Foo : IContinuation<InventoryItem, Customer>
{
  ...
}

The actual representation of those parameters when generating URLs is specified separately.

The parameters that are not IConvertible can also exploit proxy values, which are basic IConvertible values that represent the server-side non-IConvertible object, like InventoryItem above. This proxy type is Clavis.Id<TProxy, TType>, which is basically a logical identifier of type TProxy that designates an object instance of type TType. For instance, a Customer with an integer identifier 1234 has an Id<int, Customer> = 1234.

The Id<TProxy, TType> type shouldn't normally appear in a continuation specification, although it can if you really want it to. It's typically used only inside the continuation when accepting/parsing URL parameters for non-IConvertible types. See below for more details.
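To show why a typed identifier is useful, here is a hypothetical stand-in for such a proxy type. It is only a sketch of the idea, not the actual Clavis.Id definition; the only member taken from this post is the Key property used in the examples further down.

```csharp
using System;

// Hypothetical stand-in for Clavis.Id<TProxy, TType>: a key of type TProxy
// that is tagged, at compile time, with the type of object it designates.
public readonly struct Id<TProxy, TType> where TProxy : IConvertible
{
    public TProxy Key { get; }
    public Id(TProxy key) => Key = key;
    public override string ToString() => typeof(TType).Name + "#" + Key;
}

public sealed class Customer { public int Id { get; set; } }
public sealed class Product { public int Id { get; set; } }

public static class Program
{
    static string Describe(Id<int, Customer> id) => "customer key " + id.Key;

    public static void Main()
    {
        var custId = new Id<int, Customer>(1234);
        var prodId = new Id<int, Product>(1234);

        Console.WriteLine(Describe(custId)); // customer key 1234
        Console.WriteLine(prodId);           // Product#1234

        // Describe(prodId); // would not compile: an Id<int, Product> is not an Id<int, Customer>
    }
}
```

Both identifiers wrap the same integer, but the compiler keeps a customer key from being used where a product key is expected.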

In summary, any type can be used in a continuation declaration, and the special types Unsafe<T> and IEnumerable<T> hold special meaning in Clavis.

URL Generation

Clavis autogenerates URLs by convention based on the fully qualified type name. The translation is simply:

Some.Namespace.SomeClass+Inner => /Some/Namespace/SomeClass+Inner?

The assembly is not included because Clavis assumes that an external component actually resolves paths to concrete instances, which is consistent with standard ASP.NET conventions.
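The convention is simple enough to sketch in a few lines. The helper below is hypothetical and only reproduces the translation shown above from a type's full name; how Clavis actually computes the path, and how the overriding attribute is applied, is not shown here.

```csharp
using System;

namespace Some.Namespace
{
    public class SomeClass
    {
        public class Inner { }
    }
}

public static class UrlConventionSketch
{
    // Replace the namespace separators with path separators. Nested types
    // keep the '+' that the CLR uses in their full name.
    public static string ToPath(Type page) =>
        "/" + (page.FullName ?? page.Name).Replace('.', '/') + "?";
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UrlConventionSketch.ToPath(typeof(Some.Namespace.SomeClass.Inner)));
        // /Some/Namespace/SomeClass+Inner?
    }
}
```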

You generate a URL by calling the Continuation.ToUrl() overloads with the required parameters:


Continuation.ToUrl<Foo, InventoryItem, Customer>(
  item.AsParam(item.Id),
  customer.AsParam(customer.Id));

Alternately, you could use Continuation.Params().ToUrl(), which exploits a little more of C#'s type inference and so is perhaps more convenient:


Continuation.Params(item.AsParam(item.Id),customer.AsParam(customer.Id))
            .ToUrl<Foo>();

Note how InventoryItem and Customer must specify an IConvertible type as a proxy/representation for the query param, because they are not IConvertible themselves. Parameters ultimately all reduce to IConvertible primitive types, and you will receive a compile-time error if you try to specify a non-IConvertible type as a representation.

By default, Clavis generates the URL parameter names based on the class name, so both of the calls above produce a URL that looks like:

~/Foo?-InventoryItem=123&-Customer=456&clavis=asd1gh823mP

Each Param.AsParam() overload accepts an optional "key" parameter to override the parameter name, but this requires ensuring that the same name is used everywhere which can become rather inconvenient. It's much more convenient to just let Clavis handle naming for you wherever possible, and you can control parameter naming via class names which are checked by the compiler.

Processing Parameters

Now that we know how to specify continuations and their parameters, and we know how to create continuation URLs, let's see how URL parameters are actually processed inside a continuation. Clavis provides some overloaded static methods for extracting parameters, which are of the form Param.TryParseX<T>(out T value), where X is the index of the parameter in the continuation specification. Consider the following continuation:

public class Counter : System.Web.UI.Page, IContinuation<int>
{
  int counter;
  ...
}

In the page's OnInit method, we can extract the integer parameter like so:

public class Counter : System.Web.UI.Page, IContinuation<int>
{
  int counter;
  override protected void OnInit(EventArgs e)
  {
    if (this.TryParse0(out counter))
      this.lblMessage.Text = "Current: " + counter;
    else
      this.lblMessage.Text = "New counter: 0";

    base.OnInit(e);
  }
}

TryParse0 means "parse parameter 0 of the continuation specification", which in this case is of type Int32. TryParseX returns true if the query parameter exists and was successfully parsed, else it returns false. If we were trying to process the Nth continuation parameter, then we would call TryParseN.
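For readers curious what this try-parse pattern might look like underneath, here is a minimal sketch that converts a raw query value through the IConvertible machinery. It is an assumption-laden analogue, not the Clavis API: the ParamSketch class, its explicit key argument and the dictionary standing in for the request's query string are all hypothetical, and the HMAC validation step is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ParamSketch
{
    // Hypothetical analogue of TryParseX: look the key up, then convert the
    // raw string to T. Returns false if the key is missing or malformed.
    public static bool TryParse<T>(IDictionary<string, string> query, string key, out T value)
        where T : IConvertible
    {
        value = default;
        if (!query.TryGetValue(key, out string raw))
            return false;
        try
        {
            value = (T)Convert.ChangeType(raw, typeof(T), CultureInfo.InvariantCulture);
            return true;
        }
        catch (Exception e) when (e is FormatException || e is OverflowException || e is InvalidCastException)
        {
            return false;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var query = new Dictionary<string, string> { ["-Int32"] = "41", ["Decimal"] = "oops" };

        if (ParamSketch.TryParse(query, "-Int32", out int counter))
            Console.WriteLine("Current: " + counter);                             // Current: 41

        Console.WriteLine(ParamSketch.TryParse(query, "Decimal", out decimal d)); // False: not a number
        Console.WriteLine(ParamSketch.TryParse(query, "-String", out string s));  // False: key is missing
    }
}
```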

As you can see, Clavis handles all the tedious IConvertible parameter parsing for you. In the case of non-IConvertible types, we utilize Id<TProxy, TType>:

public class CustomerDetails : Page, IContinuation<Customer>
{
  Customer cust;
  override protected void OnInit(EventArgs e)
  {
    Id<int, Customer> custId;
    if (this.TryParse0(out custId))
      cust = SomeDb.Customers.Single(x => x.Id == custId.Key);
    else
      throw new InvalidOperationException("No customer specified!");

    base.OnInit(e);
  }
}

Note again that Clavis handles the tedious parsing of the proxy value for you, and leaves it up to you to load the instance via Linq, NHibernate, etc. Sometimes parsing failure can be treated as an error like above, but when pages implement multiple continuation types, it's not necessarily an error:

public class Generic : Page, IContinuation<Customer>, IContinuation<Product>
{
  Customer cust;
  Product prod;
  override protected void OnInit(EventArgs e)
  {
    Id<int, Customer> custId;
    if (this.TryParse0(out custId))
      cust = SomeDb.Customers.Single(x => x.Id == custId.Key);

    Id<int, Product> prodId;
    if (this.TryParse0(out prodId))
      prod = SomeDb.Products.Single(x => x.Id == prodId.Key);

    // NOTE: could optionally throw error here
    //if (cust == null && prod == null) throw new ArgumentNullException("prod or cust");

    base.OnInit(e);
  }
}

Clavis can even handle nesting, like parsing unsafe lists of object identifiers:

public class Generic : Page, IContinuation<Unsafe<IEnumerable<Customer>>>
{
  IEnumerable<Customer> customers;
  override protected void OnInit(EventArgs e)
  {
    Unsafe<IEnumerable<Id<int, Customer>>> custIds;
    if (this.TryParse0(out custIds))
    {
      var ids = custIds.Value.Select(x => x.Key);
      customers = SomeDb.Customers.Where(x => ids.Contains(x.Id));
    }
    base.OnInit(e);
  }
}

Conclusion

ASP.NET isn't the ideal web framework, but I've used Clavis in a few projects to automate some of the tedium and eliminate some common errors and security holes in such applications. Because URLs are derived from types, the compiler ensures that programs have no dangling page references. Furthermore, because unsafe/unprotected parameters that the client can change are identified by a distinct type, the compiler also ensures that you're aware of every point a potentially unsafe value is used.

Finally, you can declare a high-level specification of the types a page accepts as parameters, and the compiler ensures that any pages that redirect to this page call it with the appropriate number of parameters, with the correct types, and it automatically generates the parameter names for you so you don't have to mess about with strings.

All of this is achieved in a piecemeal fashion, so you don't have to adopt Clavis within an entire project all at once. You can instead just convert one page at a time, updating all pages that redirect to it.

Full API documentation is available online, or as a downloadable .CHM file. The latest Clavis version can be downloaded here, or via NuGet.

For support, see the Clavis Trac server where all development can be tracked. The ticket system is open for bug reports, requests or questions.
