Design Meeting Notes - December 6, 2012
Additional ObjectContext and EntityConnection constructors
An EntityConnection can be constructed using an existing store DbConnection. In this situation, disposing the EntityConnection does not dispose the store connection. The same applies when an ObjectContext is constructed with an existing
EntityConnection. In many cases this is the desired behavior, but it causes problems when the EntityConnection/ObjectContext is handed off to other code which may not know that the underlying DbConnection/EntityConnection needs to be disposed.
A potential contributor has proposed adding new constructors to EntityConnection and ObjectContext that let instances of these classes know that even though the DbConnection/EntityConnection
was created by the caller it should still be disposed when the EntityConnection/ObjectContext is disposed:
public ObjectContext(EntityConnection connection, bool contextOwnsConnection)
public EntityConnection(MetadataWorkspace workspace, DbConnection connection, bool entityConnectionOwnsStoreConnection)
This was discussed in the design meeting and we decided that we would be happy to accept such a contribution.
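The ownership semantics of the proposed flag can be sketched with a small stand-alone model. This is only an illustration, not EF code: FakeStoreConnection, OwningWrapper and OwnershipDemo are hypothetical names, and the wrapper stands in for an EntityConnection (or ObjectContext) built over a caller-supplied connection.

```csharp
using System;

// Hypothetical stand-in for a store connection; records whether it was disposed.
public sealed class FakeStoreConnection : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

// Hypothetical model of a wrapper that takes an "owns" flag,
// in the same spirit as the proposed constructors.
public sealed class OwningWrapper : IDisposable
{
    private readonly IDisposable _inner;
    private readonly bool _ownsInner;

    public OwningWrapper(IDisposable inner, bool ownsInner)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        _inner = inner;
        _ownsInner = ownsInner;
    }

    public void Dispose()
    {
        // Dispose the caller-supplied object only when ownership was handed over.
        if (_ownsInner)
        {
            _inner.Dispose();
        }
    }
}

public static class OwnershipDemo
{
    // Returns whether the inner connection was disposed together with the wrapper.
    public static bool InnerDisposedAfterWrapperDispose(bool ownsInner)
    {
        var store = new FakeStoreConnection();
        using (new OwningWrapper(store, ownsInner))
        {
        }
        return store.IsDisposed;
    }
}
```

With ownsInner set to false the sketch matches the existing behavior (the caller stays responsible for the connection); with true, code that receives only the wrapper can dispose it without knowing about the underlying connection.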
Default schema changes during initialization
Out-of-the-box, Code First creates all database objects in the “dbo” schema. This can be changed using the HasDefaultSchema method of the model builder. When using Migrations the transition from one default schema to another can be handled by an explicit
When using database initializers without Migrations things get more difficult. Currently, Code First recognizes the case where HasDefaultSchema is used to move from “dbo” to something new, and will inform users how to move from “dbo” to a new schema. Code
First can do this because “dbo” is a well-known out-of-the-box default.
However, Code First without Migrations does not detect when moving from one default schema that is not “dbo” to another default schema. This is because Code First has no way of knowing the previous default schema and therefore no way of finding the MigrationHistory
We discussed ways that Code First might be able to scan all schemas to try to find potential old default schemas, but this seems too fragile in terms of both false positives and false negatives. There were no other viable ideas for detecting the old default
schema, so the options became:
- Prevent use of HasDefaultSchema when not using Migrations. This would disable useful scenarios that currently work, such as simple multi-tenancy, so we don’t want to do this.
- Make the “dbo” case consistent with the non-“dbo” case. This would make the behavior consistent, but we believe that the majority case for moving from “dbo” to something else is in early development where the current behavior is desirable.
For the non-“dbo” case, it is often useful to be able to change the default schema and have Code First create new databases automatically, without throwing or assuming that the user might be migrating from one schema to the other. This is the simple multi-tenancy scenario.
Therefore, we decided to keep the behavior as is.
Making EF queries buffer by default
Buffering breaking change
The current plan is to switch EF queries to buffer rather than stream by default. This is required for connection resiliency, as was previously discussed, but is also beneficial in other situations for a number of reasons, the main ones being:
- The connection is potentially open and in use for a shorter period of time
- Multiple Active Result Sets (MARS) would not be needed for nested queries, such as those that are common when lazy loading
Therefore, since we will now support buffering, we plan to make it the default even when not using connection resiliency. There will be an extension method on IQueryable called “AsStreaming” that will switch streaming back on. The AsStreaming method follows
the same pattern as the current AsNoTracking method.
Buffering by default is a breaking change for the following reasons, all of which can be fixed by using the AsStreaming method:
- Exceptions that might previously have come only while using the iterator (i.e. MoveNext) will now come earlier—probably from GetEnumerator, but maybe from the first call to MoveNext. We don’t believe that this is a big enough issue to block the breaking
change because it is very unlikely that applications will have exception handling that will behave differently in this situation.
- Applications that intentionally start iterating over results but break out of that iteration before finishing will behave differently, because all results will now be obtained from the database when previously some may not have been obtained. An exception caused by data
that the app never reached may therefore now be encountered. Such an application is relying on the fact that it will break out of the loop before getting to the exceptional
data, which is a risky strategy and we don’t believe it will be common.
- Applications may run out of memory where they previously would not have done so. This will not be common because for any tracking query that runs to completion all the materialized objects must fit into the state manager cache even when streaming. However,
it may happen in less common scenarios:
  - When an application intentionally breaks out of result iteration before finishing. In most cases this won’t cause a problem, but there is a small chance it could.
  - When using large no-tracking queries or projections where the resulting objects are not tracked and where references to each object are released as iteration continues. In such a situation AsStreaming may need to be used. This is probably the most likely
  real break the change could cause. We could make NoTracking queries continue to stream because of this, but that adds concept count and reduces consistency, so we are currently choosing not to.
  - When memory is very tight such that any extra overhead caused by the buffering becomes significant. We don’t believe this will be common.
- Functional responsiveness may be different since the call to GetEnumerator may take longer even though the following iteration will be faster. We don’t believe this will be a big problem.
We believe that the benefits of buffering for the vast majority of EF applications outweigh the relatively small chance of breaking existing applications and so we plan to make the change.
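The change in exception timing described above can be illustrated without EF at all. The sketch below is a hypothetical model, not EF code: ReadRows stands in for a query whose third row fails, the streaming method pulls rows one at a time, and the buffered method reads everything up front using ToList.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferingDemo
{
    // Hypothetical result source: yields rows lazily and fails on the third row,
    // standing in for a reader that hits bad data part-way through.
    public static IEnumerable<int> ReadRows()
    {
        yield return 1;
        yield return 2;
        throw new InvalidOperationException("bad row");
    }

    // Streaming: rows are pulled one at a time, so breaking out after the
    // first row never reaches the failing row.
    public static int FirstRowStreaming()
    {
        foreach (var row in ReadRows())
        {
            return row;
        }
        return -1;
    }

    // Buffering: all rows are read before iteration starts, so the failure
    // surfaces before the caller sees the first row.
    public static int FirstRowBuffered()
    {
        List<int> buffered = ReadRows().ToList(); // throws here
        foreach (var row in buffered)
        {
            return row;
        }
        return -1;
    }
}
```

In this model FirstRowStreaming returns the first row, while FirstRowBuffered throws before any row is returned. This is the same difference an application that breaks out of iteration early would see after the switch, unless it opts back into streaming.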
Micro-decisions about the AsStreaming support:
- Add method to DbQuery, SqlQuery, etc. just like AsNoTracking
- Add IQueryable extension method just like AsNoTracking:
  - Method will be implemented to work for ObjectQuery and DbQuery
  - Method will also look for “AsStreaming” method on user-implemented IQueryables, just like AsNoTracking
- Add flag to ObjectQuery for simplicity of implementation and consistency with the way NoTracking works on ObjectQuery
- For ObjectContext.ExecuteStoreQuery and similar methods we will create an overload that takes an options object to avoid having to keep adding more and more overloads to these methods in the future.
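The lookup of an “AsStreaming” method on user-implemented IQueryables could work roughly as follows. This is a minimal sketch under assumptions, not the EF implementation: StreamingExtensionsSketch and MyQueryable are hypothetical names, the handling of ObjectQuery and DbQuery is omitted, and the reflection rule (a public, parameterless instance method whose return type is compatible with IQueryable<T>) is an assumption about what "look for" means.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class StreamingExtensionsSketch
{
    // Hypothetical extension: defers to an instance AsStreaming() method
    // when the concrete queryable type declares one; otherwise a no-op.
    public static IQueryable<T> AsStreaming<T>(this IQueryable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        MethodInfo method = source.GetType().GetMethod(
            "AsStreaming",
            BindingFlags.Public | BindingFlags.Instance,
            null,
            Type.EmptyTypes,
            null);

        if (method != null && typeof(IQueryable<T>).IsAssignableFrom(method.ReturnType))
        {
            return (IQueryable<T>)method.Invoke(source, null);
        }

        return source;
    }
}

// Hypothetical user-implemented queryable that supports streaming.
public sealed class MyQueryable<T> : IQueryable<T>
{
    private readonly IQueryable<T> _inner;

    public MyQueryable(IEnumerable<T> items, bool streaming = false)
    {
        _inner = items.AsQueryable();
        Streaming = streaming;
    }

    public bool Streaming { get; private set; }

    // Found by the extension method above via reflection.
    public MyQueryable<T> AsStreaming()
    {
        return new MyQueryable<T>(_inner, true);
    }

    public Type ElementType { get { return _inner.ElementType; } }
    public Expression Expression { get { return _inner.Expression; } }
    public IQueryProvider Provider { get { return _inner.Provider; } }

    public IEnumerator<T> GetEnumerator() { return _inner.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

The point of the pattern is that code holding only an IQueryable<T> reference can call the extension method and still reach the custom implementation, while queryables that have no such method pass through unchanged.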
SQL Azure connection resiliency
Execution strategy key
The current plan provides a service that is resolved by the Service Locator with default execution strategies provided. The user can set different strategies to be resolved by the locator, but what key should be used to find the strategy?
- The provider invariant name on its own may not be sufficient because resiliency may be desired for SQL Azure but not for a local SQL Server instance, and these both use the same provider
- However, the provider should be able to provide a default, so if the specific resolution described before does not return a service then it should instead be resolved with just the provider invariant name
- The context might be correlated to the strategy to use in some cases, but this will not always be the case and there is no context available in several places where we want to use the service
- The connection string or DbConnection could be used in addition to the provider invariant name, but these are harder to handle than simple strings that can be easily matched
- The provider invariant name and server name seem to cover most cases we can think of and are simple strings that are easy to handle in a resolver and when performing registration in DbConfiguration
We will therefore resolve once with the provider invariant name and server name and if no service is returned we will resolve again with just the provider invariant name.
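The two-step resolution can be sketched as a small registry. This is an illustration only: StrategyRegistry and its members are hypothetical and are not the Service Locator or DbConfiguration API, and matching on exact strings is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry keyed by provider invariant name plus optional server name.
public sealed class StrategyRegistry<TStrategy> where TStrategy : class
{
    private readonly Dictionary<(string Provider, string Server), Func<TStrategy>> _factories =
        new Dictionary<(string Provider, string Server), Func<TStrategy>>();

    // A null serverName registers the provider-wide default.
    public void Register(string providerInvariantName, Func<TStrategy> factory, string serverName = null)
    {
        _factories[(providerInvariantName, serverName)] = factory;
    }

    // Step 1: look up provider + server. Step 2: fall back to provider only.
    // Returns null when nothing is registered for the provider.
    public TStrategy Resolve(string providerInvariantName, string serverName)
    {
        Func<TStrategy> factory;
        if (_factories.TryGetValue((providerInvariantName, serverName), out factory)
            || _factories.TryGetValue((providerInvariantName, (string)null), out factory))
        {
            return factory();
        }
        return null;
    }
}
```

With this shape, a retrying strategy could be registered for one specific server while every other server using the same provider falls back to the provider default. That is the SQL Azure versus local SQL Server case described above.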
Executing multiple methods in one transaction
There’s still the question of how to disable the execution strategy used internally in EF methods so that they could be executed in a single transaction.
- We could provide an overload on each to disable the execution strategy. However this would result in an explosion of overloaded methods, which we want to avoid.
- We could detect whether the method is being called from within an ExecutionStrategy.Execute call by setting a flag in the CallContext, but CallContext is not supported in PCL.
- It might not be good design to encourage this scenario at all. Since the ObjectContext is not transactional it is easy to forget that methods like SaveChanges mutate the ObjectStateManager and this won’t be rolled back if the next operation in the transaction
We’ll avoid adding a way to disable the execution strategy for a particular method call for now. It’s still possible to set the default execution strategy to one that doesn’t retry and use your own instance to wrap the EF methods in advanced scenarios.
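Wrapping several operations in one retried unit of work might look roughly like the following. This is a hypothetical stand-alone helper, not the EF execution strategy: RetrySketch, its parameters and the transient-error predicate are assumptions made for illustration, and a real strategy would also wait between attempts.

```csharp
using System;

public static class RetrySketch
{
    // Runs the whole unit of work; when it fails with an exception that
    // isTransient accepts, re-runs it from the start, up to maxAttempts in total.
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient, int maxAttempts)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts remaining: loop and try again.
            }
        }
    }
}
```

In the advanced scenario, the operation delegate would open the transaction, call the EF methods and commit, so that a retry replays the whole transaction rather than a single call. As noted above, in-memory state such as the ObjectStateManager is not rolled back, so the delegate would need to start from a clean state on each attempt.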
WET, DAMP, or DRY tests
We try to write tests that are very explicit and self-contained even though that often means code is duplicated between tests. This makes it easier to see what is being tested and what the expected results are, which in turn makes it easier to see situations
where a test passes but passing doesn’t really make sense or is a bad experience for the given scenario. It also makes debugging much easier.
The discussion here is not about changing this approach, but rather about the degree to which using helper methods and parameterized tests detracts from the benefits of having WET tests. The conclusions were:
- Helper methods that do common things, such as setting up mocks, that are the same for many tests are acceptable so long as they remain simple and easy to understand in the context of each test.
- Likewise, parameterized tests are acceptable so long as the parameterization only trivially alters the execution and outcome of the test. For example, having one test body that executes the same test on several derived types by passing in each derived type
as a parameter is acceptable.
- In such situations a descriptive test name should still be used for the parameterized test.
The take-home message is to make sure tests are WET enough that each test is easy to understand and debug. Be DRY only when doing so does not detract from this goal.