Bug Vanquisher

8 March 2008

Singularity, finally

Filed under: Dev inside! — Tanveer Badar @ 7:44 PM

The word is: "Singularity is finally out".

Get it from here.

13 January 2008

To Ref or Not To Ref

Filed under: Dev inside! — Tanveer Badar @ 8:22 PM

Shall I write void func( ref System.Uri uri ) or void func( System.Uri uri )?

Should it be

class something
{
     std::string* ptr;
};

or should it be done like this

class something
{
     boost::shared_ptr< std::string > ptr;
};

That, people, is the question!

Before answering this question, let us clarify some terms.

1- value semantics: upon assigning one object to another, its bit representation is copied into the assigned object, unless something fancy like a copy constructor intervenes.

2- reference semantics: no object really contains what it claims to; it holds only a pointer to the actual bit representation. Upon assignment that pointer is copied, and both objects end up pointing to the same location in memory.

3- pass-by-value: parameter initialization obeys value semantics; the newly initialized object contains an exact copy of the initializer object, unless something fancy like a copy constructor intervenes.

4- pass-by-reference: the parameter refers to the same object as the initializing argument, so changes made through it are visible to the caller. Remember the old joke in Java (which is still pass-by-value only, as far as I know)? The following swap never swaps its caller's variables:

void swap( int a , int b )
{
    int temp = a;
    a = b;
    b = temp;
}
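The joke translates directly to C++, where both conventions are spelled out in the signature. A minimal sketch of the broken by-value swap next to the by-reference one that works:

```cpp
#include <cassert>

// Pass-by-value: the parameters are copies, so the caller's variables are untouched.
void broken_swap( int a , int b )
{
    int temp = a;
    a = b;
    b = temp;
}

// Pass-by-reference: the parameters alias the caller's variables, so the swap sticks.
void working_swap( int& a , int& b )
{
    int temp = a;
    a = b;
    b = temp;
}
```

Calling broken_swap( x , y ) leaves x and y unchanged; working_swap( x , y ) actually exchanges them.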

5- value types: value types obey value semantics by default, but can be made to obey reference semantics by using explicit references or pass-by-reference across function calls.

6- reference types: by default they obey reference semantics, and there is no known way to make them obey value semantics without some really fancy coding. That fancy coding typically requires the reference type to exist in two forms, one mutable and the other frozen. When value semantics are required you freeze an object; from then on, assignments make copies of it instead of setting the underlying pointer. But someone has to write that fancy code.
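A sketch of how that fancy coding might look in C++ (the design and names are mine, not a standard idiom, and I am assuming C++11's std::shared_ptr, the modern counterpart of the boost pointer above). The type shares its payload like a reference type by default, but once frozen, copies made from it get their own payload. A real implementation would also reject mutation of frozen objects; that guard is elided here.

```cpp
#include <memory>
#include <string>

// Hypothetical sketch of the "two forms" idea: shares its payload by default,
// deep-copies it when the source has been frozen.
class text
{
    std::shared_ptr< std::string > data;
    bool frozen;

public:
    explicit text( std::string value )
        : data( std::make_shared< std::string >( std::move( value ) ) ) , frozen( false ) { }

    void freeze( ) { frozen = true; }

    text( const text& other ) : data( other.data ) , frozen( other.frozen )
    {
        if( other.frozen )   // frozen source: copy the bits, not the pointer
            data = std::make_shared< std::string >( *other.data );
    }

    std::string& value( ) { return *data; }
};
```

Copies taken before freeze( ) alias one string; copies taken after it are independent.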

The battle between value types and reference types is very much like "Men are from Mars and Women are from Venus". Each has its merits and demerits. A comparison follows, with examples from C++, C# and Java.

Value Types

1- Default in C++.
2- Derived objects suffer slicing.
3- Unnecessary copying, hidden costs.
4- Copy constructor often necessary, along with assignment operator and destructor.
5- Explicit sharing of objects, no reference counting headaches[2].
6- Helps in proper resource ownership[3].
7- Function calls have pass-by-value as default.
8- References are a step-child[5].
9- Destructors often have proper semantics.
10- Memory management is explicit (C++) or absent (CLR)[7].
11- Nullable value types are often required when interfacing with cross-domain functionality.

Reference Types

1- Default in almost every other language.
2- No slicing; references are the default.
3- No copying, unless explicitly asked for.
4- Copy constructor[1] semantically not present.
5- Multi-threading nightmare.
6- Mad COW[4]!
7- Simulated pass-by-value can be counter-intuitive.
8- Value types suffer[6] from performance issues.
9- Destructors entirely missing from the picture, no deterministic clean-up.
10- Typically, a garbage collector runs behind the scenes.
11- Non-nullable reference types are under active research.

[Nitpicker’s corner]
1- MemberwiseClone and ICloneable do not have clearly defined semantics; they are even sort of deprecated.
2- I am not talking about COM here, it is only a core language level comparison.
3- RAII idiom.
4- I recommend you read that entire series.
5- I can’t seem to find a reference (no pun intended) at the moment, but it is impossible to write reference types in C++ exactly as they are implemented in C# or Java.
6- The costs listed on that page may vanish in future, but I have plenty more reasons up my sleeve.
7- Please don’t even get me started on this!
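Entry 2 in the value-type column (slicing) takes only a few lines of C++ to demonstrate: copying a derived object into a base object keeps only the base sub-object, while a reference preserves the derived part.

```cpp
#include <cassert>
#include <string>

// Slicing demo: base and derived are illustrative names.
struct base
{
    virtual std::string name( ) const { return "base"; }
    virtual ~base( ) { }
};

struct derived : base
{
    std::string name( ) const override { return "derived"; }
};
```

Given derived d, the copy base sliced = d; reports "base" (the derived part was sliced away), while base& ref = d; still reports "derived".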

23 December 2007

Catching the Exceptional

Filed under: Computer Theory, Dev inside! — Tanveer Badar @ 4:42 PM

[Warning: None of the code, even the simplest one, may compile. I was just demonstrating a point, code is meta-information to support that.]

Let’s do some coding. You have a simple program that calculates the inverse of an integer.

using System;

namespace Program
{
    public class Exceptional
    {
        public static void Main( )
        {
            Func( );
        }

        static void ShowInverse( int num )
        {
            Console.WriteLine( 1.0 / ( double )num );
        }

        static void Func( )
        {
            Console.WriteLine( "Please enter a number: " );
            try
            {
                int num = int.Parse( Console.ReadLine( ) );
                ShowInverse( num );
            }
            catch( Exception ex )
            {
                Console.WriteLine( ex.Message );
            }
        }
    }
}

I do not claim this program to be the best ever; you will see what I am trying to do. That int.Parse line can fail for any number of reasons, which is why it is surrounded by a try-catch block. So we are rid of formatting errors for the moment. But this leaves one problem wide open: what happens if the input is ‘0’? We may need to add another try-catch in ShowInverse. The revised ShowInverse becomes

        static void ShowInverse( int num )
        {
            try
            {
                Console.WriteLine( 1.0 / ( double )num );
            }
            catch( Exception ex )
            {
                Console.WriteLine( ex.Message );
            }
        }

This is essentially equivalent to having a nested try-catch block inside the one in Func. You get the idea of how we nest try-catch blocks to deal with different syntactic/semantic errors.

Now, let’s get a bit abstract: our Exceptional class has some functions which encapsulate business logic and call each other inside try-catch blocks. Each of them logs to a database on success, on exceptions, and in its finally block.

using System;

namespace Program
{
    public class Exceptional
    {
        // logging logic all goes inside Logger.Log function, which will take only a string.
        Logger log = new Logger( );

        void Func1( ... )
        {
            try
            {
                log.Log( "Executing Func1" );
                ...
                Func2( ... );
                log.Log( "Ending Func1's try block." );
            }
            catch( Exception ex )
            {
                log.Log( "Error occurred inside Func1: " + ex.Message );
            }
            finally
            {
                log.Log( "Exiting Func1." );
            }
        }

        void Func2( ... )
        {
            try
            {
                log.Log( "Executing Func2" );
                ...
                log.Log( "Ending Func2's try block." );
            }
            catch( Exception ex )
            {
                log.Log( "Error occurred inside Func2: " + ex.Message );
            }
            finally
            {
                log.Log( "Exiting Func2." );
            }
        }
    }
}

We log everything, even the exceptions, to a database. Here’s a pop quiz: what happens when the original exception was thrown because the underlying database is not available? Even in that case we attempt logging, fail in our catch blocks, and throw another exception.

Suppose this happened while Func2 was executing: we catch the first exception and try to log it, another exception is thrown, Func1 attempts to log that one and throws yet another. There is no one to catch the third exception, which will cause whatever service Exceptional was providing to fail.

Is the answer to add yet another try-catch inside all existing catch/finally blocks? Definitely not. What if someone else comes along later, sees no logging in a catch/finally block, and adds it? We are back to the same problem we were attempting to solve.
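One conventional way out, sketched here in C++ for brevity (the design and names are mine, not from the code above): make the logger itself a leaf that reports failure instead of throwing, so that logging an exception can never raise a second one.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical sketch: a logger whose sink may throw (say, the database is
// down), wrapped so that the failure is swallowed rather than propagated.
class safe_logger
{
    std::function< void( const std::string& ) > sink;

public:
    explicit safe_logger( std::function< void( const std::string& ) > s )
        : sink( std::move( s ) ) { }

    bool log( const std::string& message ) noexcept
    {
        try
        {
            sink( message );
            return true;
        }
        catch( ... )
        {
            // Swallow: a failing log must never replace the exception being reported.
            // A fallback (local file, event log) could go here instead.
            return false;
        }
    }
};
```

The catch blocks in Func1/Func2 can then call such a logger freely; the worst case is a lost log entry, not a cascading failure.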

Suppose this code was part of a console application. Eventually, the exception would surface to the runtime, which would notice that no one has handled it. For Windows, you can find the details in MSDN about what happens when a program throws an unhandled exception; other operating systems follow similar patterns. If everything fails, the program is terminated. If this happened in kernel mode, you lose everything.

If you pause to think for a moment, you will see that our exception handling code forms a hierarchy. We have various levels of handlers, from the swallow-everything type to log-and-rethrow to transforming ones. At the lower levels, you have all sorts of choices, from retrying the operation to keeping things silent to invoking some alternative or compensating functionality.

The one which causes the program to terminate is the most beautiful. It is the root which must withstand the worst of all. It is unique in the way that nothing passes this handler, nothing can ever pass this one. It handles everything where others fail.
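In C++ terms (only an analogue of the C# above; guarded_main and its run parameter are names I made up for the sketch), that root handler is a catch-all wrapped around the program's real work:

```cpp
#include <exception>
#include <functional>
#include <iostream>

// The root of the handler hierarchy: nothing gets past it. Whatever the lower
// levels failed to handle ends here, and the process exits with an error code
// instead of being torn down by the runtime's unhandled-exception path.
int guarded_main( const std::function< int( ) >& run )
{
    try
    {
        return run( );
    }
    catch( const std::exception& ex )
    {
        std::cerr << "fatal: " << ex.what( ) << '\n';
    }
    catch( ... )
    {
        std::cerr << "fatal: unknown exception\n";
    }
    return 1;
}
```

A real main( ) would simply forward to it: return guarded_main( real_work );.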

19 December 2007

I Hate foreach

Filed under: Bugz, Dev inside! — Tanveer Badar @ 3:42 PM

There are many reasons. People don’t implement it properly: foreach uses duck typing, which does not require one to implement IEnumerable/IEnumerator, yet people always go ahead and implement them, with virtual functions and properties. Consider the performance hit when you are recursively iterating over all the files on a drive.
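For comparison, C++'s range-based for has the same duck typing, and a hand-rolled range needs no interface and no virtual dispatch; a minimal sketch (counter is an illustrative name):

```cpp
// A hand-rolled half-open range [first, last): no IEnumerable-style interface,
// no virtual functions. The range-based for binds to begin()/end() by name.
struct counter
{
    int first , last;

    struct iterator
    {
        int value;
        int operator*( ) const { return value; }
        iterator& operator++( ) { ++value; return *this; }
        bool operator!=( const iterator& other ) const { return value != other.value; }
    };

    iterator begin( ) const { return iterator{ first }; }
    iterator end( ) const { return iterator{ last }; }
};
```

for( int i : counter{ 1 , 4 } ) visits 1, 2, 3, and every call is resolved statically, so the compiler is free to inline the whole loop.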

foreach also requires that the collection not be modified during enumeration, yet people still do these kinds of things

foreach( Transaction transaction in transactions )
{
    transactions.Remove( transaction );
}

in [it].
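For the record, the same mistake in C++ is undefined behavior outright. The sanctioned pattern, sketched below, erases through the iterator that erase returns instead of the one just invalidated:

```cpp
#include <cassert>
#include <vector>

// Removing elements while iterating, done correctly: std::vector::erase
// returns the next valid iterator, so the loop never touches a stale one.
void remove_even( std::vector< int >& values )
{
    for( auto it = values.begin( ) ; it != values.end( ) ; )
    {
        if( *it % 2 == 0 )
            it = values.erase( it );   // erase invalidates it; use the returned iterator
        else
            ++it;
    }
}
```

The erase-remove idiom (std::remove_if followed by erase) achieves the same thing in one pass without the manual loop.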

[It] is Done

Filed under: Dev inside!, Personal — Tanveer Badar @ 3:33 AM

It is too late for me to be writing anything this late (3:30 A.M.). But, [it] is shipping later today in the afternoon.

What is [it]? It [it] is something I have been working on for the whole year. I’ll talk about [it] later someday. It [It] is repeated and in [ ] because [it] is not the pronoun, it is [it].

[Update: 20/12/2007. [It] did not ship yesterday due to some issues with [them]. And I feel sorry for those who hate nested [ ]. Yep, I just did that again. [:)] ]

8 December 2007

BigInt For .Net Framework

Filed under: Dev inside!, Tips — Tanveer Badar @ 11:43 PM

Is it there or is it not?

The answer was yes for the beta 1 build of netfx 3.5. It remains yes for the beta 2 and RTM versions if you can manage to run under full trust, because then you can use reflection to create an instance of BigInteger and invoke its members.


This value type supports the usual operators: binary +, -, *, /, %, unary + and -, ++ and --, and the comparisons ==, !=, <, <=, >, >=.

For functions, it has Abs, Add, Compare, CompareTo (overloaded), Divide, DivRem, GreatestCommonDivisor, ModPow, Multiply, Negate, Parse (overloaded), Pow, Remainder, Subtract, ToByteArray, ToString (overloaded) and TryParse (overloaded) to offer. Note that Add, Divide, Multiply, Negate, Remainder and Subtract actually implement the functionality for the corresponding operators.

There are also a host of convert-from and convert-to operators, performing conversions to and from all built-in numeric data types.

16 November 2007

Super Cool!!!

Filed under: Dev inside!, Fun — Tanveer Badar @ 8:23 PM

It seems my love is not the only love around. People are crazy about their stuff too.

.NET – A Love Story

.NET Love Story Continues…

3 November 2007

Building The Tools

Filed under: Computer Theory, Dev inside! — Tanveer Badar @ 1:12 PM

Connected with the extremely twisted last post, consider the methods of programming the machines we have seen since 1943. Let us build a hierarchy of the tools we use for programming, and see how they must have come to be what they are right now.

ENIAC was programmed by rewiring and, some four years later, through ROM function tables. Only six women programmed it. Assemblers came first; compilers followed a few years later. We see the first portable programming language in 1972. High-level languages soon follow, my first love enters the picture, and the world is swarming with all sorts of OO and high-level languages by the mid 1980s. Declarative languages add to the mix around that time too. Right now, we are partying on very expressive programming environments.

Grab the Visual C++ compiler, the language I use most. Or better yet, csc.exe. Consider how that compiler itself must have been built. If you have seen Rotor, you will know that csc is itself written in C++. Therefore, it requires a C++ compiler, most likely cl.exe. Put cl.exe into the interrogation box and ask the same question: who is your mother? It will say: a cl.exe with a smaller build or version number than mine.

But this isn’t a chicken-and-egg problem, because it must have begun with the first build of cl.exe. Let’s say it was compiled with a C compiler. Then who was the mother of that compiler? Some earlier C compiler. And who gave birth to the first one? An assembler is the most likely answer, say masm.exe. The question repeats itself, only this time in the context of assemblers. The first assembler would have built the second, and must itself have been built on some machine that required programming by rewiring (only a guess, I have no proof :)).

This machine may have built a few assemblers. Those assemblers must have bootstrapped themselves, and eventually one of them may have become powerful enough to build a C compiler. (Actually, from what I just read, it goes further back than that assumption: C was a transformation of B, whose compiler had its origin in BCPL’s compiler. But let us again skip the details and say the BCPL compiler may have been built with an assembler.)

This early compiler must have bootstrapped itself too, building more powerful versions of itself and other compilers. Definite proof of this is Cfront, the first C++ compiler, which emitted C code that was then compiled and executed. The process has been bootstrapping itself ever since, giving rise to more and more language compilers of every sort, and hence we trace the path from the root to csc.exe!

If you look carefully, you will find my favorite data structure here too. Conceptually, ENIAC or something similar lies at the root. At every level, a transformation program (an assembler or compiler) is either bootstrapping itself or increasing the set of available transformation programs at the next level. From ENIAC, we move to a decent number of assemblers, which bootstrap themselves to assemble the BCPL compiler (again, my hypothesis). The BCPL compiler compiles the B compiler, which bootstraps itself and generates a C compiler. This C compiler bootstraps again and really makes the bomb go off, giving rise to other C compilers and Cfront. And again, the rest is history.

29 October 2007

The Kid Left Alone

Filed under: Dev inside! — Tanveer Badar @ 2:11 AM

Microsoft is going to ship LINQ next year. This statement is as sound as "the US Army is not going to leave Iraq in the next 10 years".

With LINQ, they have added a very powerful data sub-language to the two pet languages of theirs, C# and VB.Net. VC should not complain; they have their own fish to fry, with things like re-implementing the code model for VS 10, improving performance for common usages, exposing the C++ grammar’s AST programmatically, and miscellaneous others. They got Class Designer included in VS 2008 Beta 2 with its own set of problems, because System.Windows.Documents.FixedDocument cannot handle extremely wide XPS documents.

But I haven’t discussed one child of Microsoft’s parentage which will need LINQ support, and that is PowerShell. PowerShell has the capability to embed C# code (and, unless I am very much wrong, only C#) in its scripts, which allows it to do almost anything a regular program can do. This means that when the out-of-box cmdlets are insufficient you can go straight for C# as your first choice and resort to more esoteric means if need be.

But surely, if you can do anything with C# in it, you should be able to mix in a little LINQ too. Silverlight 1.1 will have LINQ support, according to the Silverlight 1.1 documentation. Yet I haven’t heard any news about LINQ support in PowerShell.

So why leave this particular kid alone?

P.S. I should note that some of the links point to bugs I have filed on Connect.

14 October 2007

LINQ/Entity Framework vs. nHibernate

Filed under: Dev inside! — Tanveer Badar @ 1:40 PM

To begin with, let me explain what nHibernate is in the first place. nHibernate is a persistence library written solely for the .net framework, modeled after Java’s Hibernate library, that provides services to persist .net framework objects to and from an underlying database, in a very concise manner of speaking.

The framework requires you to map tables to classes, which it calls persistent classes or entities. This mapping is done via XML files which map specific columns to properties. The most visible feature is that you never have to write SQL statements in your code. Apart from that, you never need to worry about referential integrity any more, because it maps foreign keys quite efficiently and provides easy access to other tables through property notation.

Microsoft came out very (very, very, very… ad infinitum) late with an object persistence framework of their own: the ADO.Net Entity Framework. This framework provides similar facilities. To say the least, you don’t have to worry about joins and foreign keys ever again (if the generator tools work correctly, that is).

The Entity Framework will compete very heavily with nHibernate. Both provide similar services to developers: a developer using either can forget about SQL (after proper entity generation, of course) and use object/property syntax to save and load data. But the problems for nHibernate don’t end here.

To add insult to injury, Microsoft will ship LINQ next year with VS 2008. LINQ provides powerful capabilities for writing inline queries in high-level languages like C# and VB (C++/CLI can also benefit, but only from LINQ’s method syntax). LINQ to SQL and LINQ to Entities will make life still easier for developers. While you have to do this in nHibernate to select all customers who have not performed any transactions yet:

ISession session = factory.OpenSession( );

try
{
    return session.CreateCriteria( typeof( Customer ) )
                  .Add( Expression.NotEq( "DebitAccount" , customer ) )
                  .List<Customer>( );
}
finally
{
    session.Close( );
}

On the other hand, when you write the same query in C#/LINQ, the syntax is much more like SQL:

var query = from customer in Customers
            from transaction in Transactions
            where customer.Account != transaction.DebitAccount
            select customer;

We can clearly see that everyone is going to prefer the LINQ/Entity Framework dual delights over nHibernate.

Of course, Microsoft has the unfair advantage of owning the C# compiler; they can make whatever changes they like to the language and be done with it in only a couple of years. Just to support LINQ they have

  • added anonymous types: I could have written select new { customer.Name }; if I wanted only the names, and this would result in a projection when the SQL was eventually generated.
  • lambda expressions: when using method syntax I would have written .Where( (customer, transaction) => customer.Account != transaction.DebitAccount ).
  • extension methods: LINQ can be applied to anything that implements IEnumerable<T>; this is possible because Where is declared roughly as public static IEnumerable<T> Where<T>( this IEnumerable<T> source , Func<T, bool> predicate ).
  • type inference: noticed that var query? The compiler automatically infers the type of query from the right-hand side. Likewise, (customer, transaction) => customer.Account != transaction.DebitAccount translates to a delegate returning bool whose parameter types the compiler infers.

nHibernate (or any other persistence library, for that matter) could never hope to compete with Microsoft, given the lengths they will go to retrofit their languages with query features.

[Edit: (17/06/2009) In retrospect, this post should never have been written in the first place. The more I read about EF, the more incomplete it feels. EF still has a long stretch to go to even hope for any competitive market share; only a Microsoft n00b enthusiast would recommend EF in any sensible enterprise-level application. There are too many things you cannot do in EF which are lingua franca in Hibernate/nHibernate. A dismal failure on Microsoft’s part, if one may say so.]


Blog at WordPress.com.