Jul 23, 2012 at 1:45 AM
Edited Jul 23, 2012 at 1:46 AM
@tamasflamich: Thanks for the great insights! This is an area in which we have been forced to make some compromises, so we are actively seeking feedback to validate that the compromises we picked are all right.
As usual, we expect to have to iterate over the design several times until we get most things right :)
Let me try to answer your questions while giving you some context on how we arrived at the current design...
First of all, we are definitely trying to avoid causing fragmentation in the ecosystem. We are hopeful that the EF interfaces are hidden enough from the experience (e.g. if you are writing application code you should never really need to cast to any of the interfaces, but just call ForEachAsync, ToListAsync, FirstAsync, etc.) that if a clear standard for async collections emerges we will be able to switch to it. By making the interfaces EF-specific we are also opting out of having EF become such a standard for now (as there are other teams in a better position to produce one).
I completely agree that one of the advantages of LINQ is how it abstracts the underlying query providers. We in fact talked a lot with the Languages, TPL and Rx teams to try to define together a set of common async collection abstractions we could share. As you mention, it would have been nice for us if such abstractions existed in the BCL, but the reality is that they don't exist and it is not certain that they will be added. We also looked at the IAsyncEnumerable interfaces as defined by Rx, but in the end we came to the conclusion that it was more appropriate to create a solution that was local to EF. That way we can:
- Focus on getting the experience right for application developers
- Keep the number of concepts developers need to learn to a minimum
- Retain the ability to interop with the Rx interfaces through some kind of adapter
- Enable framework-like code to do more advanced things by casting to our public interfaces
At some point we thought about having something like AsAsync(). Besides the strong advice of the Rx team against it, it felt at the time like a somewhat unnecessary abstraction:
In LINQ there is currently no strict separation between query construction and query execution, i.e. you usually perform most query construction using IQueryable<T> operators, but if you happen to use one of the “immediate” query operators then the query executes immediately. We asked ourselves the question: could we somehow draw a stricter line between query construction and execution, so that we could reuse all the existing LINQ operators for query construction but defer to the execution phase the decision on whether to process the query synchronously or asynchronously?
Based on this idea, our primary design for executing immediate queries is a method, QueryAsync, that you won’t find in the code base yet.
The method can be used like this:
var customer = await db.QueryAsync(() => db.Customers.First(c => c.Id == id));
You may notice that this expression uses the regular First operator. It works because the call to First is, in this case, captured as part of an expression and not actually executed. Expression<Func<T>> is indeed an apt representation of a deferred query that returns a single element, which is what we were looking for!
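As a rough illustration of the capture-versus-execute distinction, here is a minimal sketch that uses only LINQ to Objects and System.Linq.Expressions. DeferredQueryDemo and its members are hypothetical names made up for this example and are not part of EF. The sketch only shows that a lambda converted to Expression<Func<T>> is stored as a tree whose root is a call to First, and that nothing runs until some later step decides how to execute it.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical demo type for illustration only; not an EF API.
public static class DeferredQueryDemo
{
    // Inspects the captured expression without executing it and returns
    // the name of the method at its root (null if the root is not a call).
    public static string RootMethodName<T>(Expression<Func<T>> query)
    {
        var call = query.Body as MethodCallExpression;
        return call != null ? call.Method.Name : null;
    }

    // Stand-in for the "execution phase": compiles the tree and runs it locally.
    // A real provider could instead translate the tree and run it on a server.
    public static T ExecuteLocally<T>(Expression<Func<T>> query)
    {
        return query.Compile()();
    }
}

// Usage (assumed in-memory data):
//   var numbers = new[] { 1, 2, 3, 4 }.AsQueryable();
//   Expression<Func<int>> q = () => numbers.First(n => n > 2);
//   DeferredQueryDemo.RootMethodName(q);   // "First", nothing has executed yet
//   DeferredQueryDemo.ExecuteLocally(q);   // 3
```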
The plan is to add this method at a later point. Once we do so, it will allow any LINQ to Entities-recognized expression to be processed asynchronously on the server, e.g. it will even be possible to execute a query that doesn’t start with an IQueryable<T>:
var distance = await db.QueryAsync(() => spatialPoint1.Distance(spatialPoint2));
Since this is a generally useful thing to do, we are planning to add a synchronous version of the same method as well.
Async immediate LINQ operators
Initially we were going to add only ForEachAsync, ToListAsync and QueryAsync for asynchronous queries, but later we ended up deciding to add async versions of the other immediate LINQ operators as a compromise justified mostly by method discoverability, so you can now do something like this:
var customer = await db.Customers.FirstAsync(c => c.Id == id);
In my opinion these methods are a bit weird, but they are very convenient. In any case, we are seeking feedback on these (both their existence and their implementation). A few details:
- These methods are purposely defined in a sponsor class in our own namespace so that they don’t accidentally “pollute” other IQueryable<T> queries unless you have imported System.Data.Entity.
- In their current implementation, rather than creating an expression with a call to themselves as most immediate LINQ operators do, they compose an expression with a method call to the non-async version and then invoke the IDbAsyncQueryProvider.ExecuteAsync method.
- They throw if the IQueryProvider cannot be cast to IDbAsyncQueryProvider.
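The second and third points above can be sketched roughly as follows. This is not the EF implementation: ISketchAsyncQueryProvider is a hypothetical stand-in for IDbAsyncQueryProvider (whose actual signature may differ), and SketchAsyncQueryableExtensions is an invented class name. The sketch only shows the shape of the idea: build an expression that calls the synchronous Queryable.First, then hand it to an async-capable provider, failing if the provider is not one.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for IDbAsyncQueryProvider; the real interface may differ.
public interface ISketchAsyncQueryProvider : IQueryProvider
{
    Task<TResult> ExecuteAsync<TResult>(Expression expression, CancellationToken cancellationToken);
}

// Hypothetical extension class, for illustration only.
public static class SketchAsyncQueryableExtensions
{
    // Queryable.First<T>(IQueryable<T>, Expression<Func<T, bool>>), looked up once.
    private static readonly MethodInfo FirstWithPredicate =
        typeof(Queryable).GetMethods()
            .Single(m => m.Name == "First" && m.GetParameters().Length == 2);

    public static Task<T> FirstAsync<T>(
        this IQueryable<T> source, Expression<Func<T, bool>> predicate)
    {
        // Throw if the provider cannot execute queries asynchronously.
        var provider = source.Provider as ISketchAsyncQueryProvider;
        if (provider == null)
        {
            throw new InvalidOperationException(
                "The query provider does not support asynchronous execution.");
        }

        // Compose a call to the synchronous First, not to FirstAsync itself.
        var call = Expression.Call(
            FirstWithPredicate.MakeGenericMethod(typeof(T)),
            source.Expression,
            Expression.Quote(predicate));

        return provider.ExecuteAsync<T>(call, CancellationToken.None);
    }
}
```

One consequence of this shape is that the provider only ever sees the ordinary First call in the tree, so it does not need to recognize any new operator; the async behavior comes entirely from which execution method is invoked.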
My main concern with these methods is that they may get in the way of testability, e.g. for developers used to taking advantage of the abstraction of IQueryable<T>, implementing an additional interface in fakes and mocks can turn out to be an additional burden. But this is something I haven’t tried myself, so it might turn out not to be a big deal. The implementation of the methods also feels a little weird, mainly due to the second point above.
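To give a feel for the size of that burden, here is a minimal sketch of an in-memory fake, under the assumption that the async provider interface has a single ExecuteAsync method. ITestAsyncQueryProvider is a hypothetical stand-in for IDbAsyncQueryProvider, and both InMemory* types are invented for this example. The fake delegates everything to the LINQ to Objects provider and completes the "async" call synchronously.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for IDbAsyncQueryProvider; the real interface may differ.
public interface ITestAsyncQueryProvider : IQueryProvider
{
    Task<TResult> ExecuteAsync<TResult>(Expression expression, CancellationToken cancellationToken);
}

// Wraps the LINQ to Objects provider so that composed queries keep the fake provider.
public sealed class InMemoryAsyncQueryProvider : ITestAsyncQueryProvider
{
    private readonly IQueryProvider _inner;

    public InMemoryAsyncQueryProvider(IQueryProvider inner) { _inner = inner; }

    public IQueryable CreateQuery(Expression expression)
    {
        throw new NotSupportedException("Only the generic CreateQuery is sketched here.");
    }

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        return new InMemoryAsyncQueryable<TElement>(this, _inner.CreateQuery<TElement>(expression));
    }

    public object Execute(Expression expression) { return _inner.Execute(expression); }

    public TResult Execute<TResult>(Expression expression) { return _inner.Execute<TResult>(expression); }

    // Runs the query synchronously and returns an already-completed task.
    public Task<TResult> ExecuteAsync<TResult>(Expression expression, CancellationToken cancellationToken)
    {
        return Task.FromResult(_inner.Execute<TResult>(expression));
    }
}

// Queryable wrapper whose Provider is the fake async provider.
public sealed class InMemoryAsyncQueryable<T> : IOrderedQueryable<T>
{
    private readonly IQueryable<T> _inner;
    private readonly IQueryProvider _provider;

    public InMemoryAsyncQueryable(IEnumerable<T> data)
    {
        _inner = data.AsQueryable();
        _provider = new InMemoryAsyncQueryProvider(_inner.Provider);
    }

    internal InMemoryAsyncQueryable(IQueryProvider provider, IQueryable<T> inner)
    {
        _provider = provider;
        _inner = inner;
    }

    public Type ElementType { get { return _inner.ElementType; } }
    public Expression Expression { get { return _inner.Expression; } }
    public IQueryProvider Provider { get { return _provider; } }
    public IEnumerator<T> GetEnumerator() { return _inner.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

A test could then build its data with new InMemoryAsyncQueryable<Customer>(customers), where Customer and customers are placeholders for the test's own types, and pass it wherever an IQueryable<Customer> is expected. Whether this amount of plumbing is acceptable in practice is exactly the open question raised above.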
We looked at several other options as well. I will need to spend some more time explaining them if you are interested :)