[Performance] Reduce start up time by loading finished Code First models from a persistent cache

description

Building and compiling large models using the Code First pipeline can be expensive in terms of start up time. Several pieces are coming together that could allow us to serialize the output of Code First (including the O-C mapping) into an XML artifact and then to deserialize it directly on subsequent runs.

Creating an efficient and completely reliable way of verifying that the serialized version of the model actually matches the Code First model doesn't seem feasible, but there are simple heuristics, e.g. comparing the timestamp of the assembly containing the model with the timestamp of the XML artifact, that should work reasonably well in common scenarios.
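
As a rough illustration of the timestamp heuristic, here is a minimal sketch. The class name `ModelCacheCheck`, the method `IsCacheFresh`, and both path parameters are hypothetical and are not part of EF or of the prototype; the sketch only compares file write times and does not load or serialize a model.

```csharp
using System;
using System.IO;

// Hypothetical helper, not an EF API: decides whether a cached XML model
// can be reused, based purely on file timestamps.
public static class ModelCacheCheck
{
    // Returns true only when both files exist and the cached XML was written
    // at or after the last write of the assembly that contains the model.
    public static bool IsCacheFresh(string assemblyPath, string cachePath)
    {
        if (!File.Exists(assemblyPath) || !File.Exists(cachePath))
        {
            return false;
        }

        DateTime assemblyTime = File.GetLastWriteTimeUtc(assemblyPath);
        DateTime cacheTime = File.GetLastWriteTimeUtc(cachePath);
        return cacheTime >= assemblyTime;
    }
}
```

When the check returns false, the caller would rebuild the model through the normal Code First pipeline and rewrite the XML. The heuristic can miss changes that do not touch the assembly file, such as a model that depends on configuration, which is why it is described above as working only in common scenarios.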

Initial tests with a hacky prototype show that the start up time of a model with 100 entities can go down from 8 seconds to 2 seconds using this approach.

file attachments

comments

rothdave wrote Nov 28, 2013 at 7:25 AM

I like this idea. In fact, before we migrated to EF, we used a similar approach in NHibernate. On startup, we checked whether the timestamp of the DLL containing the models was newer than the timestamp of the XML file. If so, we regenerated the XML. Otherwise we loaded the cached one.

Could you please share your prototype with us, so that I can try it out? (Doesn't matter if it's hacky, I am just interested in the bits ...)

RoMiller wrote Dec 12, 2013 at 8:30 PM

EF Team Triage: Moving issues with Impact set to Low out of the 6.1.0 release as we only have time to address High and Medium issues in this release. We will re-triage these issues for future releases.

This does not exclude someone outside of the Microsoft EF team from contributing the change/fix in 6.1.0.

rothdave wrote Dec 16, 2013 at 8:12 AM

@RoMiller: How is the "impact type" defined in EF?
This feature would massively reduce startup time (8 vs 2 seconds) for the model, so in my opinion this is a high-impact task.

However, as already asked in my previous post, I would be happy if you could share your initial prototype with the community. This would help contributors to understand how this feature could be implemented.

thanks.

ChrisMaeder wrote Dec 30, 2013 at 9:40 AM

This is impacting us fairly severely with our database implementation, so I feel that this should be categorized as a high-impact issue.
Thank you

RoMiller wrote Dec 31, 2013 at 8:46 PM

Impact is a combination of a few things, but most notably how severe the issue is and what percentage of our customers it impacts. I'm clearing the Impact field on this one though as I agree it's probably not low.

rothdave wrote Jan 9 at 2:17 PM

Could you please share the branch with the prototype of this feature?

Duality wrote Feb 12 at 9:11 AM

Randomly, I made a repo for my EF Code First work. My approach to building the dynamic contexts seemed to save 0.5s over a normal Code First context in my performance benchmarks... until I realised that most of the difference was because my dynamic context contained:
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        Database.SetInitializer<MyContext>(null);
    }
I wonder how many production implementations could start quicker if the initializer didn't run, maybe default it to null / none?
Anyway, sorry to wander off the track.

rothdave wrote Mar 18 at 9:05 AM

Now that EF 6.1 RTM is out, will this issue be considered for the next release?
Startup time is an important aspect in our application, and the numbers you posted here look very promising.

If not, could you please share the prototype with the community?

emilcicos wrote Mar 18 at 3:39 PM

The initial prototype, while proof of concept from the perf. point of view, was very rough because it was missing a lot of "plumbing". Not long ago I started improving it a bit, but I was side-tracked with other tasks. I will try to wrap-up what I have this weekend and share it.

rothdave wrote Mar 18 at 4:35 PM

Thanks for your feedback emilcicos! Great news that you are working on this subject!

emilcicos wrote Mar 25 at 2:28 PM

I attached a .zip containing a .patch file with the prototype and a .cs file with sample code.
I only did some basic testing and it has not yet gone through our official review process, but if you find it useful I can iterate on it as needed.

rothdave wrote Mar 25 at 4:06 PM

Thank you so much emilcicos!! That is fantastic! I just applied your patch to current master and the performance boost is super awesome!

These are my results without an NGen'd EF (not possible because I cannot strong-name sign my custom build with your patch):

First-Query-Time:
  • Without DbModelStore: 4.5 seconds
  • With cached DbModelStore: 1.8 seconds
So I guess if I could NGen this build, the startup time would be about 0.6 seconds (NGen'd EF saves about 1.2 seconds on my machine), which is very nice!

So I really hope your work will be in the EF-Alpha channels soon, so I can use it in combination with Ngen :)
Thanks :)

rothdave wrote Mar 27 at 11:14 AM

I have a question regarding the heuristics for checking whether the generated XML is still valid. At the moment your solution does not check if the model has changed. This will lead to runtime errors.

One simple solution would be to check against the timestamp of the model assembly.
However, in the case of pregenerated views, EF can automatically determine the differences via hashing and transparently regenerate the view files if necessary.
Is something similar possible or would this eliminate the performance gains?
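
One way to picture a hash-based check is the sketch below. It is an assumption-laden illustration, not how EF validates pregenerated views and not part of the patch: it hashes the model assembly file itself and compares the result with a hash stored next to the cached XML. The names `ModelCacheHash`, `ComputeFileHash`, `IsCacheValid`, and the hash file are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, not an EF API: validates a cached model by comparing
// a stored SHA-256 hash with the current hash of the model assembly file.
public static class ModelCacheHash
{
    // Computes the SHA-256 hash of a file and returns it as a hex string.
    public static string ComputeFileHash(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        }
    }

    // Returns true when the hash file exists and matches the assembly's current hash.
    public static bool IsCacheValid(string assemblyPath, string hashFilePath)
    {
        if (!File.Exists(assemblyPath) || !File.Exists(hashFilePath))
        {
            return false;
        }

        string stored = File.ReadAllText(hashFilePath).Trim();
        string current = ComputeFileHash(assemblyPath);
        return string.Equals(stored, current, StringComparison.OrdinalIgnoreCase);
    }
}
```

Hashing the assembly means reading the whole file on every start, so the cost grows with assembly size. Whether that noticeably reduces the gains would need to be measured.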

rothdave wrote Mar 31 at 6:24 AM

In case anyone is also using the cached db model store: here is my implementation of DefaultDbModelStore, which invalidates/removes the cached XML file if its last write time is earlier than the last write time of the corresponding domain assembly:

https://gist.github.com/davidroth/9886349

Maybe this will be useful for some people.

rothdave wrote Apr 16 at 6:21 AM

@EfTeam/emilcicos: Could you please let me know if this feature will get into the master branch any time soon?
This would be very nice, because then I could use NGen again, which is currently not possible with my self-built assembly. I also lost the advantage of using the EF alpha channels.

I would be happy if you could write me a quick reply so that I know if it's worth rolling out a real self-signed assembly myself.

thanks.

ajcvickers wrote Apr 16 at 4:12 PM

@rothdave This work item is currently in the "Future" release which means that we don't plan to include it in the next release of EF.