Design Meeting Notes - February 7, 2013

DbContext and transactions

Continuing the discussion from last week where the following pattern was suggested:

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }
}

Part of the niceness here is that the connection opening is managed. That is, if the connection is not open, then it is opened as part of the BeginTransaction call. However, if BeginTransaction did open the connection, then it should also close the connection when its using scope ends—i.e. when the object it returns is disposed. This leads to the question of what type is the object that is returned? Options:

  • It could be the raw, store DbTransaction object returned from DbConnection.BeginTransaction. However there is no way to be notified when this object is disposed, so there would be no way to close the connection. We could just not close the connection, but this seems both “wrong” and could result in connections being open for a long time if the context doesn’t own the connection or if the context is not disposed in short order.
  • It could be a new type that extends from DbTransaction and that wraps the existing DbTransaction—Decorator pattern. The problem here is that app may need the underlying store DbTransaction in order to pass it to things like the DbCommand.Transaction property. This means that there should be a way to get from the type we return to the underlying store transaction. We would probably do this by adding a property to the new type, making the type public, and having BeginTransaction explicitly return our new type.
  • We could follow the pattern described in the previous bullet, but make the new type not inherit from DbTransaction. Instead it would be an explicitly new type concerned with managing the transaction and connection together. It would have a property for the store transaction and methods for commit and rollback, but probably nothing else.

We will likely go with the third option, although it has been hard to come up with a good name for the type and so this remains an open question. The shape of the class will be something like this:

public class DbNoNameYet : IDisposable
{
    internal DbNoNameYet() {}
    public void Commit()
    public void Rollback()
    public void Dispose()
    public DbTransaction Transaction { get; }
}

Some common usages:

using (var context = new MyContext())
{
    // Use the provider's default isolation level
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }
}

using (var context = new MyContext())
{
    // Use a specific isolation level
    using (var transaction = context.Database.BeginTransaction(IsolationLevel.Chaos))
    {
        // Do some stuff
        transaction.Commit();
    }
}

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Optionally do some stuff with context
        
        // Do some raw stuff on a command
        var myCommand = context.Database.Connection.CreateCommand();
        myCommand.Transaction = transaction.Transaction;
        myCommand.CommandText = "Foo";
        myCommand.ExecuteNonQuery();

        // Optionally do some more stuff with context

        transaction.Commit();
    }
}

Note that when the type returned from BeginTransaction is disposed we can do the equivalent of UseTransaction(null) such that EF will know that for future operations on that context it needs to create and use a new transaction. (Note that UseTransaction is the new name for the SetTransaction method that was mentioned in notes from the last meeting.) For example:

using (var context = new MyContext())
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // Do some stuff
        transaction.Commit();
    }

    // Do some more stuff
    context.SaveChanges(); // New transaction used
}

We also considered whether or not connection opening could be done lazily only when the transaction is actually needed? This adds complexity and may be unexpected by somebody accessing the connection, so it probably should be opened immediately.

Pull request: extending Migrations

There was a brief discussion about his pull request: http://entityframework.codeplex.com/SourceControl/network/forks/iceclow/efmigrationextensions/contribution/4023

The concepts in the pull request seem basically sound, but adding AddOperation to the public surface seems inappropriate given how it and the API are used. It feels like it should remain internal. We could instead have an interface that is explicitly implemented so that the normal API surface is not polluted, possibly together with extension methods. Also, it would be good if the dispatch could be done using dynamic instead of Reflection. This will be investigated a bit more and a response will be made to the contributor.

Issues with SQL Server spatial types

The change to use buffering (BufferedDataReader) causes some issues with spatial types:

  • Perf regressions when spatial types are included in the query. The root cause of these regressions are not fully understood, but part of it is due to not using GetSqlBytes (see below.)
  • A breaking change at the provider level whereby we now pass a BufferedDataReader instead of the store data reader. If the provider casts to the store data reader type then this will now fail.
  • A breaking change due to the way SQL Server spatial types are handled. See below.

Background on how SQL Server spatial types interact with EF. (Yes, this is ugly.)

How EF spatial types work:

  • All methods delegate to the DbSpatialServices implementation
  • The DbSpatialServices is resolved lazily when the first instance is instantiated
    • Uses GetService<DbSpatialServices>() if not null
    • Else if there's a SQL spatial assembly that can be loaded with Assembly.Load uses the SqlSpatialServices singleton
    • Else uses DefaultSpatialServices

How SqlSpatialServices works:

  • Lazily finds the latest version of the SQL Server types assembly and loads it
    • There’s no way of changing the version other than removing the SQL Server types assembly from the GAC
  • All EF spatial values returned will internally rely on a SQL Server spatial value of the version chosen above
    • If a SQL Server spatial value of a different version is used it will be converted automatically

How SqlDataReader (part of SQLClient; not part of EF) works with spatial values:

  • SQL Server spatial types come in two versions
    • v10 corresponding to SQL Server 2008R2
    • v11 corresponding to SQL Server 2012
  • SqlDataReader will try to use v10 even if it’s not present
    • This is the cause of the second breaking change mentioned above, since it means that SQLClient will now throw if V10 types are not present
    • It also means that, if the types are present, EF will be doing lots of conversion from v10 types to v11 types

A fix for this would be to force SqlDataReader to use v11 types. There are a few options for this:

  1. Use assembly redirection
  2. On .NET 4.5 can specify the version in the connection string
  3. Use GetSqlBytes method instead of GetValue/GetValues

Options 1 and 2 require that app developers encounter the break and make changes to fix it. For option 1, the developer must setup assembly redirection from the v10 types to the v11 types. For option 2, the app developer must change their connection string, and this also will not work on .NET 4. Neither of these options are going to give a good experience, but we could go with them if nothing better can be found.

Option 3 reverts back to using the method that was previously used. This requires changes to the provider model because at the time we are reading from the store data reader we do not know whether or not a column is a spatial type or not. We would therefore need to be able to ask the provider for this information and then make an appropriate call to the provider to return the appropriate bytes.

Given then there are very few providers that support spatial (currently only SQL Server that we own, and one other that we know of) it does not seem unreasonable to investigate this option, so this is what we will do. This will hopefully avoid both breaking changes mentioned above and also help with the perf regression.

Last edited Feb 8, 2013 at 12:32 AM by ajcvickers, version 2

Comments

JamesHiggins Feb 15, 2013 at 4:46 PM 
Thanks for the reply Arthur.

I'm a software architect, so I tend to get excited about software designs. I was surprised and quite happy to see this kind of internal design detail posted publicly for a Microsoft product! Would have been difficult not to comment since ideas popped into my head. ;)

Interfaces are a really nice tool and allow for a lot more flexibility than classes. Coming from Delphi I was always a bit sad that the VCL was created before interfaces were added and thus rarely used them. I've always found it strange when the .NET FCLs don't use them more and, in particular, the IDb* ones that exist but seem to be unused. Oh well, figured it would be preferable to use an existing interface that matched your goals but that isn't always possible.

Great job of EF by the way. I've been using ORM frameworks for years and even wrote my own in the past. EF still has some room for improvement (what doesn't?) but its excellent and makes development much easier and more productive.

ajcvickers Feb 15, 2013 at 1:23 AM 
Hi James,

Thanks for taking the time to comment on these notes. With regard to using IDbTransaction, we avoid using this interface (and other similar interfaces) because of the way the .NET Framework uses these interfaces, or, I should say, doesn’t use these interfaces. That is to say that most .NET APIs accept and return instances of the concrete type rather than the interface, and this makes it a painful and often futile experience to try to program against the interfaces. There is also another related but internal reason why we can’t create new API that uses these interfaces, but unfortunately I can’t talk publicly about that reason.

With regard to using the resolver to obtain a transaction, this is an interesting idea. I will bring it up with some others on the team and see what they think.

With regard to passing the transaction to the constructor, this is certainly a nice pattern…except for the fact that DbContext already has a lot of constructors and the choice already causes a certain amount of confusion. We don’t really want to add to the concept count or constructor count here. You should still be able to use the pattern in your example by defining the constructor on your class appropriately. For example:

public class MyContext : DbContext
{
public MyContext(DbTransaction transaction)
: base(transaction.Connection)
{
this.Database.UseTransaction(transaction);
}
}

Thanks,
Arthur

JamesHiggins Feb 14, 2013 at 8:47 PM 
Darn, sorry the code didn't show up properly. I haven't posted on CodePlex much and reading in "Get Help" I saw that {code:c#} should have resulted in proper formatting. Apparently that does not apply to comments.

JamesHiggins Feb 14, 2013 at 8:45 PM 
Its great that you're posting design meeting notes publicly. I had a couple of thoughts regarding the transaction discussion from 1/31 and 2/7.

Regarding what class to return and what to name it, how about returning the already existing IDbTransaction interface? Its signature matches and its aptly named, plus using an interface instead of a class enables flexibility.


"Note that there are cases with these where EF could attempt to use the transaction after it has been disposed."
If this is a concern IDbTransaction could be expanded. No good suggestion on name, IDbConnectionTransaction is a possibility or the more generic IDbTransaction2 similar to how Direct-X handles names. The implementation would update State when Rollback or Commit was called.

{code:c#}
public enum DbTransactionState { Pending, Committed, Rolledback }
public interface IDbTransaction2 : IDbTransaction
{
DbTransactionState State { get; }
}
{code:c#}


Obtaining transactions could also be implemented as a service using your new IDbDependencyResolver. Pass in IDbTransaction as the type and the EF context needing the transaction as the key. The root resolver would mirror existing EF functionality and users could add their own resolver if they needed customized transaction handling. For example a non-standard isolation level could be applied that would then be used automatically by EF whenever it started a transaction. By checking the

{code:c#}
public class MyTransactionResolver : IDbDependencyResolver
{
public object GetService(Type type, object key)
{
if (type == typeof(IDbConnectionTransaction))
{
if (key is MyEntities)
return (key as ObjectContext).Connection.BeginTransaction(IsolationLevel.Snapshot));
if (key is MyOtherEntities)
return (key as ObjectContext).Connection.BeginTransaction(IsolationLevel.Chaos));

// Allows root resolver to create a standard transaction for other context types
}
}
}
{code:c#}


Personally I'd like the option of passing the transaction into the context constructor as an overload instead of the connection. The Connection property of IDbTransaction specifies the connection, so if a transaction is provided the connection can be omitted.

{code:c#}
using (var connection = new SqlConnection(connectionString))
{
connection.Open();

using (var transaction = connection.BeginTransaction())
{
// Do some stuff with connection and transaction

using (var context = new MyContext(transaction))
{
// Do EF stuff--the transaction should be used
}

using (var context = new MyContext(transaction))
{
// Do EF stuff--the transaction should be used
}
}
}
{code:c#}