23

Closed

[Performance] Amount of associations in model affects materialization times

description

Analyzing bug 1781 it became apparent that a basic performance limitation of EF requires additional investigation. The time it takes EF to materialize data is a function of the amount of associations (foreign keys) the entire model has, even when such associations are not part of the query in question.

As an example, using the AdventureWorks database generate an EDMX model, then have the following query:
List<SalesOrderHeader> headers = null;
using (var ctx = new AdventureWorksEntities())
{
    headers = ctx.SalesOrderHeader.ToList();
}
This will materialize 31465 SalesOrderHeader entities. But the time it takes to do so depends on the size of the model we have. The test machine used has Windows 8.1 64-bit, .NET 4.5.1, high performance power setting, 12 GB RAM, Intel Xeon W3530 CPU at 2.8 GHz.

If the EDMX only has one entity type (SalesOrderHeader) materialization takes 840 milliseconds (median value, 10 runs), but when the EDMX contains a richer model, for example one with 67 entity types and 92 associations, the same test takes 7246 milliseconds to complete (median value, 10 runs).

The additional CPU is spent on code that’s affected by the amount of foreign keys in the model. The biggest difference seems to come from EntityEntry.TakeSnapshot() as it’s massively affected. It went from 173 samples to 29547 samples due to the cost of the call to TakeSnapshotOfForeignKeys(), according to Visual Studio 2013 sampling profiles.

The second biggest difference comes from ObjectStateManager.FixupReferencesByForeignKeys() which went from 5 samples to 20172 samples.

This requires additional investigation.

file attachments

Closed Feb 19, 2014 at 4:45 PM by mgirgin
See last comment

comments

jsmircic wrote Dec 13, 2013 at 7:34 PM

Hi,
One question. Is this regression from EF5 or the same behavior is there also? We still did not upgrade to EF6 and now i worry if the upgrade will get us slower queries until this is fixed?

thanks
jure

DavidObando wrote Dec 13, 2013 at 8:19 PM

This has always been a limitation in EF. It is not a new regression.

jonnybee wrote Dec 27, 2013 at 7:46 AM

Hi,
Ran the same test against AdventureWorks database and this issue is still not fixed in v 6.1.0-alpha1.

Equipment used:
· WS2008 with SqlServer 2008
· Win7 – Citrix Virtual machine as client
· Running RawBencher – compiled in Release mode without Debugger:
See also: https://github.com/FransBouma/RawDataAccessBencher for RawBencher

EF 6.0.1

1: Entity Framework v6.0.0.0 (v6.0.21010.0) : 4.370,00ms.

2: Entity Framework v6.0.0.0 (v6.0.21010.0) : 4.387,88ms.

EF 6.0.2:

1: Entity Framework v6.0.0.0 (v6.0.21211.0) : 4.712,38ms.

2: Entity Framework v6.0.0.0 (v6.0.21211.0) : 4.683,13ms.

EF 6.1.0-alpha1

1: Entity Framework v6.0.0.0 (v6.1.21218.0) : 4.524,00ms.

2: Entity Framework v6.0.0.0 (v6.1.21218.0) : 4.577,50ms.

So is this issue going to make it into the v6.1.x release?

divega wrote Dec 29, 2013 at 8:49 AM

Hi Jonny,

It is known that this issue is not addressed in 6.1 alpha 1. We are currently working on improving this aspect of performance, and yes, the plan is for those improvements we are working on to make it into the EF 6.1 release.

Thanks,
Diego

sporty81 wrote Jan 1, 2014 at 7:30 PM

For anyone who uses database first and has a lot of tables, this is a big performance problem. I would definitely test against a database with 100+ entities all in the same EDMX.

emilcicos wrote Jan 27, 2014 at 6:08 PM

Commit a640bcb005a06faf2894e9bd932eaf0e5318f4d5 improves perf by roughly 30-35% in the scenario described above. On my machine the 31465 SalesOrderHeader entities are materialized in about 2300ms (Mini-6) and 530ms (Mini-6-Single). A higher number in the Mini-6 vs the Mini-6-Single sample is expected because the entity type has 11 additional navigation properties.

RoMiller wrote Jan 30, 2014 at 6:30 PM

David to confirm improvement as part of verification.

DavidObando wrote Jan 31, 2014 at 12:38 AM

In my more modest machine the improvement seems a bit higher than reported by Emil.

I compared EF from early December 2013 versus EF from late January 2014 using the attached test. The difference in my development machine was 69% improvement in the case of the model with multiple associations. Here are the numbers:

EF early December 2013
Model with no associations: 839ms
Model with multiple associations: 5874.8ms

EF late January 2014
Model with no associations: 823.2ms
Model with multiple associations: 3479.7ms

There's not much of a difference in the model with no associations, as expected. But the difference in the model with multiple associations is very satisfying.

Fix verified.

divega wrote Jan 31, 2014 at 2:29 AM

** Closed by divega 01/30/2014 7:29PM

SolutionsDesign wrote Jan 31, 2014 at 10:54 AM

Is this fix available in the source repo here on codeplex already? Because if I build the source to release build (the latest master branch 35e16b80e6e5c0dd09d06ff80b46ecdf81434655 ) and run the benchmarks (https://github.com/FransBouma/RawDataAccessBencher) I see barely any improvement (3034ms for 6.0.2 and 2995ms for 6.1.0). Fetches done over network, not locally.

SolutionsDesign wrote Jan 31, 2014 at 11:20 AM

Tried the 6.1.0 branch (2ce7c84c22350b272e335c9c1d15ec0842c24471) Is indeed a different codebase as it gives an analyze error when building. Anyway, it gives a bit better results, but far from 30%. ->

See: http://pastebin.com/u3FiEtPn (EF v6.0.2, from NuGet)
See: http://pastebin.com/FpG8KET7 (EF v6.1.0 from source, release build, 2ce7c84c22350b272e335c9c1d15ec0842c24471)

So I can only conclude that the source here doesn't contain the fix?

divega wrote Feb 1, 2014 at 10:09 AM

Reactivating while we investigate the differences between the repro we looked at an Frans' current benchmarks,

divega wrote Feb 7, 2014 at 11:48 PM

We made a check-in today that increases the performance of scenarios similar to Frans' current benchmark (in our own in-house testing between 15% and 20%). Unfortunately this won't make it in time for 6.1 beta, but it is checked-in for the next milestone. We also came to the conclusion that we cannot make this part of materialization much faster without a more profound refactoring, something that would be too risky to do at this stage of the release.

SolutionsDesign wrote Feb 11, 2014 at 8:28 AM

Hi Diego,

The commit is this one I think: https://entityframework.codeplex.com/SourceControl/changeset/ec1f37400bb5ec30cf2e9a39382e035b78e914ac ? I'll try it out later today in our benchmark and will post the results (otherwise tomorrow :))

ErikEJ wrote Feb 11, 2014 at 9:17 AM

@Frans Just using the latest nightly build should be fine

SolutionsDesign wrote Feb 11, 2014 at 9:39 AM

@ErikEJ Will do, thanks :)

SolutionsDesign wrote Feb 11, 2014 at 1:28 PM

Haven't tried the new build yet, but I found something else in the current 6.0.2 code that might be interesting:

When I use code which has FK fields defined, the fetches are massively slower than when I don't have FK fields defined:

WITH Fk fields defined:

Entity Framework v6.0.0.0 (v6.0.21211.0). Change tracking: True. Caching: False.

Set fetches

Number of elements fetched: 31465. Fetch took: 6807ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6720ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6736ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6711ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6731ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6905ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 6778ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6745ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6691ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 6756ms. Enumerating result took: 3ms

WITHOUT Fk fields defined:

Entity Framework v6.0.0.0 (v6.0.21211.0). Change tracking: True. Caching: False.

Set fetches

Number of elements fetched: 31465. Fetch took: 3169ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2948ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 3035ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2996ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2978ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2942ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 3051ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 3008ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2933ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2948ms. Enumerating result took: 3ms

that's half the speed. Will do the same thing with the 6.1 nightly build later today.

SolutionsDesign wrote Feb 11, 2014 at 2:07 PM

With 6.1 beta nightly build from the myget site:
With FK fields:

Entity Framework v6.0.0.0 (v6.1.30207.0). Change tracking: True. Caching: False.

Set fetches

Number of elements fetched: 31465. Fetch took: 4238ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4146ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4083ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4053ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4028ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 4096ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4005ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4097ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4068ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 4082ms. Enumerating result took: 3ms

Without FK fields:

Entity Framework v6.0.0.0 (v6.1.30207.0). Change tracking: True. Caching: False.

Set fetches

Number of elements fetched: 31465. Fetch took: 2981ms. Enumerating result took: 3ms
Number of elements fetched: 31465. Fetch took: 2829ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2822ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2836ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2833ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2828ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2836ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2844ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2835ms. Enumerating result took: 4ms
Number of elements fetched: 31465. Fetch took: 2812ms. Enumerating result took: 3ms

so performance has definitely improved a lot in the case with FK fields defined, however with no FK fields defined, the picture is a bit less optimistic.

HTH

SolutionsDesign wrote Feb 11, 2014 at 4:18 PM

@ErikEJ mentioned on twitter that David Obando already mentioned the difference re fk fields, I completely overlooked that. Sorry about that.

SolutionsDesign wrote Feb 11, 2014 at 4:24 PM

On second thought: David Obando mentiond a model without associations, but that's something else than not FK fields. The model I used does have associations, the one without FK fields is faster in that case than the one with FK fields.

SolutionsDesign wrote Feb 11, 2014 at 4:29 PM

So to clarify:
WITH FK fields means:
Order.Customer is there
Order.CustsomerId is there

WITHOUT FK fields means:
Order.Customer is there
Order.CustomerId is NOT there. (the developer has to use Order.Customer.CustomerId)

DavidObando wrote Feb 11, 2014 at 4:34 PM

Improvement made by Emil on Frans' test verified, it looks like Frans also verified it himself :)

@Frans: we do know of these limitations and will continue working to close the gap between models that have FKs versus models that don't.

emilcicos wrote Feb 11, 2014 at 6:03 PM

If I am reading the nightly build history correctly the change to improve the case without FK fields is not included in v6.1.30207.0. The first one that has it is v6.1.30208.0 (current nightly build is v6.1.30211.0). The improvement should be a bit more than 15% (not 5% as resulted from the numbers above), unless I did something wrong while measuring.

SolutionsDesign wrote Feb 11, 2014 at 6:36 PM

@emilcicos ah :) I used the nightly from the myget source, which seems 4 days old or so :) I posted the full results here: http://pastebin.com/AAqRhH4X (blog post here with nicer tables/info: http://weblogs.asp.net/fbouma/archive/2014/02/11/fetch-performance-of-various-net-orm-data-access-frameworks-part-2.aspx) so you can see how EF is doing against other frameworks.

Tomorrow I'll do a test with a build from source to see how they compare.

ErikEJ wrote Feb 11, 2014 at 7:09 PM

Frans, the myget feed on my pc is dated 0111 !?

ErikEJ wrote Feb 11, 2014 at 7:09 PM

Sorry, mean 0211

SolutionsDesign wrote Feb 11, 2014 at 7:41 PM

I see it's been updated 3 hours ago (https://www.myget.org/gallery/aspnetwebstacknightly), I pulled the build this afternoon, more than 6 hours ago, so I missed the update :) Will re-run tomorrow.

mgirgin wrote Feb 17, 2014 at 5:16 PM

@Frans were you able to re-run? :) I was wondering how it works for you now.

mgirgin wrote Feb 19, 2014 at 4:44 PM

Closing this one -- the improvements were verified for 6.1 release. We'll file separate bugs as necessary to ensure remaining issues are also tracked for future releases.

BBGONE wrote May 27, 2014 at 7:08 AM

I'v tried EF6.1. Imported all AdventureWorksLT2012 tables and views into the model.
Now this code
 using (var dbctx = new AdventureDAL.AdventureWorksLT2012Entities(AdventureDAL.EntityConnectionFactory.GetADWConnection()))
                {
                    var cnt = dbctx.Customer.Count();
                    var customers = dbctx.Customer.OrderBy(c => c.LastName).ThenBy(c => c.MiddleName).ThenBy(c => c.FirstName).ThenBy(c => c.MiddleName);
                    AdventureDAL.Customer currentCustomer = null;
                    for (int i = 0; i < cnt; ++i)
                    {
                        currentCustomer = customers.Skip(i).Take(1).FirstOrDefault();
                        sb.AppendFormat("{0} {1} {2}<br/>", currentCustomer.LastName, currentCustomer.MiddleName, currentCustomer.FirstName);
                    }
                    return sb.ToString();
                }

takes around 7 seconds to execute.
EF6.1 is unusable

divega wrote May 27, 2014 at 5:34 PM

@BBGONE: Have you tried these steps to reduce start up time with NGEN: http://msdn.microsoft.com/en-US/data/dn582034?

SickDotNet wrote Jun 7, 2014 at 3:50 PM

Haven't tried AdventureWorks, but starting a clean new project has the same problem with the latest NuGet package of EF6.1. In short, take a look at the following:
    public IEnumerable<Customer> List()
    {
        var returnValue = new List<Customer>();

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        using (var context = new EFDataContext())
        {
            var list = context.Customers
                .Where(c => c.TenantId == TenantId)
                .ToList();

            returnValue.AddRange(list);
        }

        stopwatch.Stop();
        Debug.WriteLine("Execution took (ms): " + stopwatch.ElapsedMilliseconds);

        return returnValue as TResponse;
    }
Using database/model first, this takes approx. +1.600ms to execute
Using codefirst, this takes approx. +3.100ms to execute
Using plain old sql (SqlCommand), this takes approx. 70ms to execute

The latter just to state that there's nothing wrong with the I/O times on the system. The measured values are a medion after about 5 runs, in debug mode on Visual Studio 2013 Update 2 and initialized through a unit test.

Some other facts:
  • This query results in 4 records to be returned
  • Then entire table has a whopping 5 records in total
  • There are two foreign keys on the table
  • Database set on SQL Server 2012 Express
Seems a really, REALLY serious issue here, people...

DavidObando wrote Jun 11, 2014 at 8:29 PM

Hi SickDotNet.

We know cold start is not fast. Model initialization and validation require time to complete before the actual queries are executed. This process, however, happens only once per app domain. Once the model is "warm" then queries are leaner.

Executing a unit test in debug mode is not ideal if you're trying to measure the performance of a system. We recommend writing performance tests that resemble more what your production environment will look like, use release bits, tune the power settings of your machine to "high performance". In order to get faster performance also ngen EntityFramework.dll and run in a controlled environment where the performance test is not competing for critical resources. We also recommend measuring different dimensions such as the cost of cold versus warm queries and the cost of model building.

We noticed that in your case there's a large cost associated to using Code First instead of an EDMX. This additional cost comes from the model building process by which EF takes your code and builds the corresponding model in memory (the equivalent of an EDMX), and then continues validating and using the model in the same way as models with an EDMX file. Depending on the size of your Code First mode, the process of model building might be lengthy. The only way to reduce this cost is by having an EDMX file that EF can use to load the model instead of inferring it.

We are assuming your model is mid to large sized by the time of your code fist path. Can you confirm this is the case? Also, can share with us a profile of the execution of this test in order to understand where it is that EF is spending the time working?

Let us know if you have additional questions.

David.

sgissinger wrote Aug 22, 2014 at 10:25 PM

Hi,
I used to use EF4, I never used EF5 and began to migrate from VS2010 to VS2013 in order to use more modern technologies.

I made 2 little Code First models :
  • one with 3 base abstract classes inherited by 12 (4x3) table-classes with no FK
  • one with 5 table-classes with only 4 FKs
Whatever the model it always use about 8 seconds to start from cold query. I won about 1 second with pre-generated views and another second with NGEN.
My computer is only about 4 years old and my customer are able to use older computers.

Fortunately, I use them with a little database used in a WinForms application and I can keep a reference during the whole application lifetime without having memory issues.

Unfortunately, this application needs to apply changes to about 150 differents databases using the same schema (model 1).
Disposing and re-creating a DbContext foreach database is very very costly.
I'll try to avoid using() { } statements, but I'm afraid that I'll need to rewrite everything with the good old SqlConnection, SqlCommand objects.

Maybe you may allow us to create EDMX file at runtime, save it as a file and reuse for later DbContext use. And adding some kind of EDMX updater with Code First migrations.

As a conclusion I'm disappointed by EF 6.1.1 performances but it's usage is simpler and easier genericable than EF4.0 and its ObjectContect.
I prefer to use MS products but this performance issue is very huge for such little models and I will certainly give a try to NHibernate.

Sebastien.

divega wrote Aug 23, 2014 at 1:39 PM

Hi Sebastien,

Could you create a new workitem and include a repro? What you are describing seems completely unrelated to #1829, plus we would like to understand what is making your models so slow and also if you are following the necessary steps to cache the model correctly for each separate database schema.

Thanks,
Diego

DavidObando wrote Aug 25, 2014 at 3:54 AM

Hi Sebastien,

I second Diego: we'd like to hear more from you. If you could create a bug describing the models you are using, or better yet, attaching a repro of the bug for our analysis, that would be very helpful.

Additionally, you can generate the EDMX from the code first project.

C# EdmxWriter.WriteEdmx(context, new XmlTextWriter(new StreamWriter("MyModel.edmx")));

Call it in your context instance and you’ll get an EDMX file with all the data you require. Add the EDMX to your project and modify your connection string accordingly to make use of the EDMX data.

David.