# Monday, December 01, 2008
Few days ago I was trying to put some Guids into the ItemCollection of a fresh, new WorkItem - I used that pattern earlier and never noticed anything strange going on. But this time was a totally different... What I did was something like this:
Guid a = Guid.Empty;
Guid b = Guid.Empty;

WorkItem.Items.Add(a, "A");
WorkItem.Items.Add(b, "B");
so since this point of development I didn't had the needed Values so I filled empty guids to test other stuff before implementing this. So at another point I tried to get the values out of the collection by its keys "A" and "B" (well I had real names for it but lets keep it simple)
Guid a = (Guid)WorkItem.Items["A"];
Guid b = (Guid)WorkItem.Items["B"];
unfortunally this threw an exception sometimes because null was returned... Yes - sometimes - not always... Nice... As I took a closer look it turned out that only the second line might crash and that the chance that this happens was growing with the elapsed time since I put the Guids in... so at this point it seemed a bit like radioactive decay... ok you'll guess that it was not only time - so I could speed up the decay by doing some operations in my window... I suspected it had something to do with the garbage collection since it is done asynchronously and may occur more often the more operations allocate memory. Besides there is no way to put any null in the WorkItem.Items-Collection which means that no other side effect should be abled to have Item B modified.
I played a bit around and found out, that always the last inserted item might disappear while the other one was stable. I also figured out that different values made both items stable so I asked myself if there might be any selfcleaning process in the CAB that would remove duplicates sometimes but what would that make sense for since the items where added with an explicit ID ("A" & "B").

Ok the solve of the mystery was quite a bit of both... so a little session with the .NET Reflector later I think I found what explained this. It turned out that even when an explicit ID was given the collection was searched for an Equal value. If it found one it did not add the Item to the list instead it used a method called BuildRepeatedItem and this -some levels deeper- (through the ReadWriteLocator...) led into not adding it to any list but into a Dictionary<TKey, WeakReference> and like its written on a Microsoft page:

"WeakReference Class - Represents a weak reference, which references an object while still allowing that object to be reclaimed by garbage collection."

nice... so whats the error here anyway? I don't currently see why a WeakReference should be applied to something that comes from the "API-User" and could be anything. On the other hand it does also not make much sense to compare the values if an ID is given. Ok maybe it was never intended to add simple value types into the workItem.Items-Collection but I think the problem would never occur if the Contains-Check in the Collection would have been done in a ReferenceEqual-comparison way instead of an Equality-comparison which leads to different results with value and reference types.

posted on Monday, December 01, 2008 5:03:02 PM (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Tuesday, October 07, 2008
Recently I've been evaluating usage of Entity Framework instead of Linq to SQL. My first steps took me into comparison of traversing object relations. I made up an example which will demostrate how my attempts went with linq first and then in contrast will do the same with entity framework...


so this is what the linq-Designer had created of the dummy-database... I then used this to enumerate all Persons and maybe their pets and their cars in a listBox...
 listBox.Items.Clear();
 using (var ctx = new EntitiesDataContext())
 {
     var dlo = new DataLoadOptions();
     dlo.LoadWith<Person>(p => p.Pets);
     dlo.LoadWith<Person>(p => p.Cars);
     ctx.LoadOptions = dlo;

     foreach (var person in ctx.Persons.OrderBy(k => k.Firstname))
     {
         listBox.Items.Add(String.Format("Person {0}, {1}", person.Lastname, person.Firstname));

         foreach (var pet in person.Pets)
             listBox.Items.Add(String.Format("  Has Pet {0} called {1}", pet.Species, pet.Petname));

         foreach (var car in person.Cars)
             listBox.Items.Add(String.Format("  Has a {0} {1} with sign {2}", car.Brand, car.Type, 
car.Sign)); } }
notice that this wont work the way its meant to be by using LoadWith with two 1..n relations to person because it might lead into very big and very redundant resultsets and linq wont allow it... so what linq does instead is to select Person and Car und then select every Pet when its enumerated. So this is the produced SQL-Trace:
SELECT [t0].[PersonId], [t0].[Firstname], [t0].[Lastname], [t1].[CarId], 
[t1].[PersonId] AS [PersonId2], [t1].[Brand], [t1].[Type], [t1].[Sign], ( SELECT COUNT(*) FROM [dbo].[Car] AS [t2] WHERE [t2].[PersonId] = [t0].[PersonId] ) AS [value] FROM [dbo].[Person] AS [t0] LEFT OUTER JOIN [dbo].[Car] AS [t1] ON [t1].[PersonId] = [t0].[PersonId] ORDER BY [t0].[Firstname], [t0].[PersonId], [t1].[CarId] exec sp_executesql N'SELECT [t0].[PetId], [t0].[PersonId], [t0].[Species], [t0].[Petname] FROM [dbo].[Pet] AS [t0] WHERE [t0].[PersonId] = @x1',N'@x1 uniqueidentifier',@x1='E069349B-9088-4136-A84D-B0A11E69EAD9' exec sp_executesql N'SELECT [t0].[PetId], [t0].[PersonId], [t0].[Species], [t0].[Petname] FROM [dbo].[Pet] AS [t0] WHERE [t0].[PersonId] = @x1',N'@x1 uniqueidentifier',@x1='971E0FC7-E1E7-4205-BEB6-F51FAF1560CF' exec sp_executesql N'SELECT [t0].[PetId], [t0].[PersonId], [t0].[Species], [t0].[Petname] FROM [dbo].[Pet] AS [t0] WHERE [t0].[PersonId] = @x1',N'@x1 uniqueidentifier',@x1='18986A72-C948-4C43-9498-D4060B721273'
I was used to work around this sometimes by iterating the first result, collecting the primary keys, then making one single select with a IN ( Id1, Id2, ... )-Clause to get all the related entries to insert the new relationships based on their foreign-Key from this select. But thats only for some performance considerations..

Now it September 2008 and we got the first Service Pack of Visual Studio 2k8 coming along with the first release of ADO.NET-Entity Framework... Long Story short... heres the same example with EF:

the desginer looka really similar to Linq-Designer (and currently to me it does not seem to provide that much more than linq by this release). But using these classes to traverse their relations is a little bit different to linq.. while linq will auto-expand their relations when they are enumerated - EF does not until you use the Load Method or you tell EF to Include (-Method) the relation (please correct me if I missed something at this point). Again Long Story short - the Code:
 listBox.Items.Clear();
 using (var ctx = new BlogDBEntities())
 {
     foreach (var person in ctx.Person.Include("Pet").Include("Car").OrderBy(k => k.Firstname))
     {
         listBox.Items.Add(String.Format("Person {0}, {1}", person.Lastname, person.Firstname));

         foreach (var pet in person.Pet)
             listBox.Items.Add(String.Format("  Has Pet {0} called {1}", pet.Species, pet.Petname));

         foreach (var car in person.Car)
             listBox.Items.Add(String.Format("  Has a {0} {1} with sign {2}", car.Brand, car.Type, 
car.Sign)); } }
Besides the easier-to-use Syntax of Include instead of the DataLoadOptions, by looking under the hood it turned out that EF does face the problem in a different way than linq here.. instead of avoiding a large resultset EF didn't seem to care a lot about that so the resulting Select-Statement was...
SELECT 
 [UnionAll1].[PersonId] AS [C1], 
 [UnionAll1].[Firstname1] AS [C2], 
 [UnionAll1].[Lastname] AS [C3], 
 [UnionAll1].[C2] AS [C4], 
 [UnionAll1].[C1] AS [C5], 
 [UnionAll1].[C3] AS [C6], 
 [UnionAll1].[PetId] AS [C7], 
 [UnionAll1].[Species] AS [C8], 
 [UnionAll1].[Petname] AS [C9], 
 [UnionAll1].[PersonId1] AS [C10], 
 [UnionAll1].[C4] AS [C11], 
 [UnionAll1].[C5] AS [C12], 
 [UnionAll1].[C6] AS [C13], 
 [UnionAll1].[C7] AS [C14], 
 [UnionAll1].[C8] AS [C15], 
 [UnionAll1].[C9] AS [C16]
 FROM  (SELECT 
     CASE WHEN ([Extent2].[PetId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
     [Extent1].[Firstname] AS [Firstname], 
     [Extent1].[PersonId] AS [PersonId], 
     [Extent1].[Firstname] AS [Firstname1], 
     [Extent1].[Lastname] AS [Lastname], 
     1 AS [C2], 
     CASE WHEN ([Extent2].[PetId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C3], 
     [Extent2].[PetId] AS [PetId], 
     [Extent2].[Species] AS [Species], 
     [Extent2].[Petname] AS [Petname], 
     [Extent2].[PersonId] AS [PersonId1], 
     CAST(NULL AS int) AS [C4], 
     CAST(NULL AS uniqueidentifier) AS [C5], 
     CAST(NULL AS varchar(1)) AS [C6], 
     CAST(NULL AS varchar(1)) AS [C7], 
     CAST(NULL AS varchar(1)) AS [C8], 
     CAST(NULL AS uniqueidentifier) AS [C9]
     FROM  [dbo].[Person] AS [Extent1]
     LEFT OUTER JOIN [dbo].[Pet] AS [Extent2] ON [Extent1].[PersonId] = [Extent2].[PersonId]
 UNION ALL
     SELECT 
     2 AS [C1], 
     [Extent3].[Firstname] AS [Firstname], 
     [Extent3].[PersonId] AS [PersonId], 
     [Extent3].[Firstname] AS [Firstname1], 
     [Extent3].[Lastname] AS [Lastname], 
     1 AS [C2], 
     CAST(NULL AS int) AS [C3], 
     CAST(NULL AS uniqueidentifier) AS [C4], 
     CAST(NULL AS varchar(1)) AS [C5], 
     CAST(NULL AS varchar(1)) AS [C6], 
     CAST(NULL AS uniqueidentifier) AS [C7], 
     1 AS [C8], 
     [Extent4].[CarId] AS [CarId], 
     [Extent4].[Brand] AS [Brand], 
     [Extent4].[Type] AS [Type], 
     [Extent4].[Sign] AS [Sign], 
     [Extent4].[PersonId] AS [PersonId1]
     FROM  [dbo].[Person] AS [Extent3]
     INNER JOIN [dbo].[Car] AS [Extent4] ON [Extent3].[PersonId] = [Extent4].[PersonId]) AS [UnionAll1]
 ORDER BY [UnionAll1].[Firstname] ASC, [UnionAll1].[PersonId] ASC, [UnionAll1].[C1] ASC
which of course returned a large resultset with many redundant entries. Since EF should know nothing about the actual data before selecting I dont assume that there might be any Statement-Generation-Changes as an optimization based on for example the amount of entries in the tables (which of course the database will do by its own). I've added the query plan here for your interest


I dont consider this as a problem for the database but it might be a problem if you'd fetch a larger resultset via network (besides the short-time client-/serversided increased memory consumption).

posted on Tuesday, October 07, 2008 1:00:20 PM (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
-