Design Meeting Notes - February 7, 2013

DbContext and transactions

Continuing the discussion from last week where the following pattern was suggested:

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }
}

Part of the niceness here is that the connection opening is managed. That is, if the connection is not open, then it is opened as part of the BeginTransaction call. However, if BeginTransaction did open the connection, then it should also close the connection when its using scope ends—i.e. when the object it returns is disposed. This leads to the question of what type is the object that is returned? Options:

  • It could be the raw, store DbTransaction object returned from DbConnection.BeginTransaction. However there is no way to be notified when this object is disposed, so there would be no way to close the connection. We could just not close the connection, but this seems both “wrong” and could result in connections being open for a long time if the context doesn’t own the connection or if the context is not disposed in short order.
  • It could be a new type that extends from DbTransaction and that wraps the existing DbTransaction—Decorator pattern. The problem here is that app may need the underlying store DbTransaction in order to pass it to things like the DbCommand.Transaction property. This means that there should be a way to get from the type we return to the underlying store transaction. We would probably do this by adding a property to the new type, making the type public, and having BeginTransaction explicitly return our new type.
  • We could follow the pattern described in the previous bullet, but make the new type not inherit from DbTransaction. Instead it would be an explicitly new type concerned with managing the transaction and connection together. It would have a property for the store transaction and methods for commit and rollback, but probably nothing else.

We will likely go with the third option, although it has been hard to come up with a good name for the type and so this remains an open question. The shape of the class will be something like this:

public class DbNoNameYet : IDisposable
{
    internal DbNoNameYet() {}
    public void Commit()
    public void Rollback()
    public void Dispose()
    public DbTransaction Transaction { get; }
}

Some common usages:

using (var context = new MyContext())
{
    // Use the provider's default isolation level
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }
}

using (var context = new MyContext())
{
    // Use a specific isolation level
    using (var transaction = context.Database.BeginTransaction(IsolationLevel.Chaos))
    {
        // Do some stuff
        transaction.Commit();
    }
}

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Optionally do some stuff with context
        
        // Do some raw stuff on a command
        var myCommand = context.Database.Connection.CreateCommand();
        myCommand.Transaction = transaction.Transaction;
        myCommand.CommandText = "Foo";
        myCommand.ExecuteNonQuery();

        // Optionally do some more stuff with context

        transaction.Commit();
    }
}

Note that when the type returned from BeginTransaction is disposed we can do the equivalent of UseTransaction(null) such that EF will know that for future operations on that context it needs to create and use a new transaction. (Note that UseTransaction is the new name for the SetTransaction method that was mentioned in notes from the last meeting.) For example:

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }

    // Do some more stuff
    context.SaveChanges(); // New transaction used
}

We also considered whether or not connection opening could be done lazily only when the transaction is actually needed? This adds complexity and may be unexpected by somebody accessing the connection, so it probably should be opened immediately.

Pull request: extending Migrations

There was a brief discussion about his pull request: http://entityframework.codeplex.com/SourceControl/network/forks/iceclow/efmigrationextensions/contribution/4023

The concepts in the pull request seem basically sound, but adding AddOperation to the public surface seems inappropriate given how it and the API are used. It feels like it should remain internal. We could instead have an interface that is explicitly implemented so that the normal API surface is not polluted, possibly together with extension methods. Also, it would be good if the dispatch could be done using dynamic instead of Reflection. This will be investigated a bit more and a response will be made to the contributor.

Issues with SQL Server spatial types

The change to use buffering (BufferedDataReader) causes some issues with spatial types:

  • Perf regressions when spatial types are included in the query. The root cause of these regressions are not fully understood, but part of it is due to not using GetSqlBytes (see below.)
  • A breaking change at the provider level whereby we now pass a BufferedDataReader instead of the store data reader. If the provider casts to the store data reader type then this will now fail.
  • A breaking change due to the way SQL Server spatial types are handled. See below.

Background on how SQL Server spatial types interact with EF. (Yes, this is ugly.)

How EF spatial types work:

  • All methods delegate to the DbSpatialServices implementation
  • The DbSpatialServices is resolved lazily when the first instance is instantiated
    • Uses GetService<DbSpatialServices>() if not null
    • Else if there's a SQL spatial assembly that can be loaded with Assembly.Load uses the SqlSpatialServices singleton
    • Else uses DefaultSpatialServices

How SqlSpatialServices works:

  • Lazily finds the latest version of the SQL Server types assembly and loads it
    • There’s no way of changing the version other than removing the SQL Server types assembly from the GAC
  • All EF spatial values returned will internally rely on a SQL Server spatial value of the version chosen above
    • If a SQL Server spatial value of a different version is used it will be converted automatically

How SqlDataReader (part of SQLClient; not part of EF) works with spatial values:

  • SQL Server spatial types come in two versions
    • v10 corresponding to SQL Server 2008R2
    • v11 corresponding to SQL Server 2012
  • SqlDataReader will try to use v10 even if it’s not present
    • This is the cause of the second breaking change mentioned above, since it means that SQLClient will now throw if V10 types are not present
    • It also means that, if the types are present, EF will be doing lots of conversion from v10 types to v11 types

A fix for this would be to force SqlDataReader to use v11 types. There are a few options for this:

  1. Use assembly redirection
  2. On .NET 4.5 can specify the version in the connection string
  3. Use GetSqlBytes method instead of GetValue/GetValues

Options 1 and 2 require that app developers encounter the break and make changes to fix it. For option 1, the developer must setup assembly redirection from the v10 types to the v11 types. For option 2, the app developer must change their connection string, and this also will not work on .NET 4. Neither of these options are going to give a good experience, but we could go with them if nothing better can be found.

Option 3 reverts back to using the method that was previously used. This requires changes to the provider model because at the time we are reading from the store data reader we do not know whether or not a column is a spatial type or not. We would therefore need to be able to ask the provider for this information and then make an appropriate call to the provider to return the appropriate bytes.

Given then there are very few providers that support spatial (currently only SQL Server that we own, and one other that we know of) it does not seem unreasonable to investigate this option, so this is what we will do. This will hopefully avoid both breaking changes mentioned above and also help with the perf regression.

Last edited Feb 7, 2013 at 11:32 PM by ajcvickers, version 2