Native support for nested set operations

Topics: EF Designer, EF Runtime, General
Feb 7, 2013 at 11:14 AM
Edited Feb 7, 2013 at 11:15 AM
I know this is really an implementation detail and might not exactly be relevant for the runtime, but I've been working a lot on streamlining tree operations in sql server, and have begun to move away from the adjacency model which promotes recursive calls, e.g., this is an example product category structure:
| Id | Name | ParentId |
| 1  | Foo  | NULL     |
| 2  | Bar  | 1        |
With the adjacency model, if I need read a tree or sub-tree, I have to recursively select where ParentId = ?. The adjacency model works really well for maintaining trees, not so much for read operations.

With nested sets, we mark the extents of the sub-tree within the larger tree, e.g.:
| Id | Name | ParentId | Left | Right |
| 1  | Foo  | NULL     | 1    | 6     |
| 2  | Bar  | 1        | 2    | 3     |
| 2  | Ner  | 1        | 4    | 5     |
This is also know as Modified Preorder Tree Traversal (MPTT). This allows us to optimise read operations which are more likely to be far more frequent than write operations. The ParentId is there purely as a mechanism for simplifying some selects (e.g.. for direct descendent, or possible to rebuild the tree if required. E.g, if I want to get the sub-tree of Foo, I would do the following:
SELECT c.Id, c.Name, c.ParentId
FROM dbo.Category c,
(
    SELECT [Left], [Right] FROM dbo.Category p WHERE p.Id = <parent id here>
) AS p
WHERE c.[Left] >= p.[Left] AND c.[Right] <= p.[Right]
ORDER BY c.[Left], c.[Right]
Conversely, the operations to insert, delete and move items are a little trickier and we have to manage the [Left] and [Right] values accordingly, but the trade off I think personally is worth it, as we get the immediate benefits of a much quicker and performant read operation, and the translation to a tree structure can happen in code, rather than spread between code and database. We bypass recursion completely.

Not sure if this is just an pipe dream, but it EF provided my built-in support for designing a tree structure and storing it in Sql Server, generating the underlying table structure and operations to read/write the tree, I'd be in developer heaven!
Developer
Feb 11, 2013 at 5:57 PM
Hi AntarisZX,

Have you considered using SQL Server HierarchyID to store your tree data? EF doesn't yet support HierarchyID, but it is something that a lot of people are interested in and is something that either the team at Microsoft and/or the community may work on post-EF6.

With regard to your specific pattern, I'm not sure that there would be enough general interest for the team at Microsoft to work on this. However, we would potentially accept a contribution for it. It's worth keeping in mind that it would quite a complicated change to get EF to support this since it would likely require changes to many areas of the stack in order to support view generation, queries, materialization, updates, etc.

Thanks,
Arthur
May 31, 2013 at 12:54 PM
"Post EF6" - given that EF6 will final with visual studio vNext in 2-3 years.... "post EF" is something that is SO far out you can as well say "won't be supported".
Jun 4, 2013 at 6:27 AM
Talk about a bad post. Now VS 2013 is announced, so EF6 will be out this year ;)
Coordinator
Jun 4, 2013 at 6:51 AM
:)
Aug 5, 2013 at 1:59 AM
Edited Aug 5, 2013 at 2:03 AM
While I love HierarchyID and implemented a set of Table-Valued Functions wrapped in stored procedures and a Tree Repository C# library, I am a bit skeptical of the future of HierarchyID in EF for the following reasons.
  • HierarchyID is not a valid EF type. I had to do a view with HierarchyID.ToString() and a set of CRUD stored procedures to update the table. Awkward.
  • Tree manipulation operations requires a very specific, unintuitive and not well documented indexing. You shouldn't need to be a rocket scientist to do trees in windows :-)
  • HierarchyID type is limited to 892 bytes. Nodes with too many levels might not fit into 892 bytes and cannot be represented. As a result of that it's not a generic solution.
I'd love to see a more generic and usable Microsoft solution, but until then the Nested Sets Model of Hierarchies discussed in Joe Celko's Trees and Hierarchies in SQL for Smarties, might be a better solution.

Ivan