Bug Vanquisher

8 June 2007

Reflection on Reflection: Function.Invoke/MemberFunction.Invoke

Filed under: C++ — Tanveer Badar @ 4:44 PM

Consider a language with no namespaces and no generic coding paradigms, just class hierarchy support. For that matter, rule out multiple inheritance too, to simplify things. To complicate things, functions can be overloaded, but runtime polymorphism is forbidden. Now let your thoughts run free with the idea that all the type information the compiler found at compile time is also available at runtime through a special function GetType( ) on every class. This returns a Type object which you can query for all functions and call Invoke for a particular method on that object.

With this background, let's converse a bit about calling the functions. All you have available is the name of the function to call and the meta data about the type. Invoke must perform these steps, in order.

  1. Validate its arguments.
  2. If it is called on MemberFunction, search through the inheritance hierarchy for the function, stopping as soon as at least one matching function is found; otherwise, search only in the global scope.
  3. If no function was found in the last step, throw a notfoundexception.
  4. For all the functions in the found set, check their number of parameters and reject all those that take fewer parameters than the number of arguments given.
  5. If no functions are left, throw an argumentmismatchexception.
  6. For the remaining functions, reject all candidates which take more parameters and do not have default values for extra parameters.
  7. If no functions are left, throw an argumentmismatchexception. By this time, we only have those functions left whose number of arguments are same or more with default values.

  8. Perform overload resolution among the candidates.
    1. Consider one function at a time. Perform positional type coercion for each parameter. Prefer implicit conversions and promotions to user defined conversions. Also, user defined conversions must be discovered through meta data, which will incur a performance hit every time. Record the number of conversions required. Also perform coercion on the return type. If the return type cannot be converted, reject that function.

    2. Those functions which require the minimum number of conversions are the best candidates.
  9. If no functions are left, throw an argumentmismatchexception.
  10. If we are left with more than one function, throw an ambiguousmatchexception.
  11. If the function can not be accessed by the caller, throw an uncallableexception.
  12. Perform type coercion on parameters for the candidate function.
  13. Perform type coercion on return value of the candidate function and return it to caller.
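The filtering and ranking steps above can be sketched roughly as follows. This is only an illustration under assumptions: Candidate, conversionCost, resolve and the three exception types are hypothetical names, types are represented as plain strings, and the conversion table is a toy one. Argument validation, the access check and the actual coercion of arguments and return value (steps 1, 11, 12 and 13) are left out.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception types named after the ones in the steps above.
struct NotFound : std::runtime_error { using std::runtime_error::runtime_error; };
struct ArgumentMismatch : std::runtime_error { using std::runtime_error::runtime_error; };
struct AmbiguousMatch : std::runtime_error { using std::runtime_error::runtime_error; };

// Hypothetical metadata for one overload; not the framework's real layout.
struct Candidate {
    std::string id;                   // label used only to report the winner
    std::vector<std::string> params;  // parameter type names, in order
    std::size_t defaults;             // number of trailing parameters with default values
};

// Toy conversion table: 0 = exact match, 1 = one built-in conversion, -1 = impossible.
inline int conversionCost(const std::string& from, const std::string& to) {
    if (from == to) return 0;
    if ((from == "int" && to == "double") || (from == "char" && to == "int")) return 1;
    return -1;
}

// Returns the id of the single best overload for the given argument types.
inline std::string resolve(const std::vector<Candidate>& found,
                           const std::vector<std::string>& args) {
    if (found.empty()) throw NotFound("notfoundexception");              // step 3
    const int none = std::numeric_limits<int>::max();
    std::string best;
    int bestCost = none;
    bool ambiguous = false;
    for (const Candidate& c : found) {
        if (c.params.size() < args.size()) continue;                      // step 4
        if (c.params.size() - args.size() > c.defaults) continue;         // step 6
        int cost = 0;
        bool viable = true;
        for (std::size_t i = 0; i < args.size() && viable; ++i) {         // step 8.1
            const int k = conversionCost(args[i], c.params[i]);
            if (k < 0) viable = false; else cost += k;
        }
        if (!viable) continue;
        if (cost < bestCost) { bestCost = cost; best = c.id; ambiguous = false; }
        else if (cost == bestCost) ambiguous = true;                      // tie for best
    }
    if (bestCost == none) throw ArgumentMismatch("argumentmismatchexception"); // steps 5, 7, 9
    if (ambiguous) throw AmbiguousMatch("ambiguousmatchexception");      // step 10
    return best;
}
```

Counting conversions, as step 8.1 suggests, is much cruder than the ranking a real C++ compiler performs, so this sketch will sometimes report an ambiguity where a compiler would pick a winner.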

If we allowed more generality than what was excluded, the conditions would have been much worse.

  1. With namespaces, there is no single global scope. You must have a Type object for each namespace. Overloads can be found at any level of the namespace hierarchy too.

  2. With generic code, there is no way to instantiate a new class each time you specify parameters for which a concrete instantiation does not exist. Therefore, an additional check must be done to see that the arguments match the type of the object they are called on. Also, those arguments which participate in generic type deduction cannot have type coercion applied to them. A generic type argument inferable through multiple parameters must be the same for every individual inference.

  3. If you want multiple inheritance, that only means more work for Invoke. It has to search all branches of base classes simultaneously. If the function was declared in multiple base classes, it should throw an ambiguous match exception. Therefore, it must construct a tree rooted at the type of the calling object and traverse it depth first to find the candidates and perform overload resolution among them.

  4. If someone thought virtual functions are a must, then how are you going to invoke the most derived implementation when all you have is a pointer to the function implementation, at most down to this level of inheritance? They must be invoked through the exact reference to the calling object and no other way.
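For the multiple inheritance case, the depth first search with ambiguity detection might look something like the sketch below. ClassInfo, collect and findDeclaringClass are hypothetical names, and the sketch assumes that a base reached along two paths counts only once (as a virtual base would); it only locates the declaring class and leaves overload resolution to a later step.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-class metadata node; the names are illustrative only.
struct ClassInfo {
    std::string name;
    std::vector<std::string> functions;   // functions declared directly in this class
    std::vector<const ClassInfo*> bases;  // direct base classes
};

// Depth first walk: a declaration found in a class hides anything in its own bases,
// so the walk stops descending along that branch.
inline void collect(const ClassInfo& c, const std::string& fn,
                    std::set<const ClassInfo*>& hits) {
    for (const std::string& f : c.functions) {
        if (f == fn) { hits.insert(&c); return; }
    }
    for (const ClassInfo* b : c.bases) collect(*b, fn, hits);
}

// Returns the one class that declares fn, or throws if there is none or more than one.
inline const ClassInfo& findDeclaringClass(const ClassInfo& start, const std::string& fn) {
    std::set<const ClassInfo*> hits;  // a set, so a shared base is counted once
    collect(start, fn, hits);
    if (hits.empty()) throw std::runtime_error("notfound");
    if (hits.size() > 1) throw std::runtime_error("ambiguousmatch");
    return **hits.begin();
}
```

Real C++ name lookup treats non-virtual bases reached along two paths as distinct subobjects, so this is a simplification rather than a faithful model of the language rules.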

6 June 2007

Get indulged

Filed under: Fun, WPF — Tanveer Badar @ 6:23 PM

Get spacetime3d. Get indulged. The wonders of 3D browsing hit desktops for the first time. I wonder if someone can do a similar thing in WPF, now with the InteractiveVisual3D library?

Reflecting on Reflection

Filed under: C++ — Tanveer Badar @ 5:50 PM

Don’t tell me you never did things like obj.GetType( ) or typeof( obj ). Everyone does, admit it. But have you ever thought about what goes on behind the scenes of all this raw power? If you had to implement such a system yourself, what design decisions would you make?

Now that you have admitted to typeof( obj ), tell me if you ever wondered which moron wrote this enumeration?

[Flags]
public enum BindingFlags // in namespace System.Reflection; member values omitted
{
    CreateInstance, 
    DeclaredOnly, 
    Default, 
    ExactBinding, 
    FlattenHierarchy, 
    GetField, 
    GetProperty, 
    IgnoreCase, 
    IgnoreReturn, 
    Instance, 
    InvokeMethod, 
    NonPublic, 
    OptionalParamBinding, 
    Public, 
    PutDispProperty, 
    PutRefDispProperty, 
    SetField, 
    SetProperty, 
    Static, 
    SuppressChangeType 
}

Why are all these access control flags mixed up with things like overload resolution and instance/static methods? Where does all the meta data about a type go? (Well in the meta data dictionary! Where else would it go?) Does MethodInfo.Invoke perform overload resolution? How are arguments coerced if the types don’t match exactly? Why do we seem to have a separate class for almost every lexical scoping construct? Why can you do this if you have proper access

class a 
{ 
    int member; 
} 

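// Instance is required to find a non-static field, and GetValue needs the object to read from.
typeof( a ).GetField( "member" , BindingFlags.NonPublic | BindingFlags.Instance ).GetValue( new a( ) );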

but not this

class a 
{ 
    int member; 
} 

a obj = new a( ); 

Console.WriteLine( obj.member ); // does not compile: member is private

Enough questions, let's discover the reasoning behind them in the CLR. First, why be a moron? That moron made a really good decision: all these things would have required a separate slot in the class, so just clump them together in a bit field to save considerable space. The meta data goes in a meta data dictionary in your assembly. Reflection APIs read it from there. Type size is essentially reduced.

And did you know that in IL you refer to an entity by its ordinal in some table? Want to call a method? Emit a call instruction on the object/class with the argument equal to its slot. Instantiating some class? Emit a call to newobj with the ordinal of that class in the type table.

Method.Invoke is a big machine. It has to search all the methods which match the name depending on being case sensitive or not. Then, it must find the best method from that set using overload resolution rules which the compiler uses at compile time. Quite a bit of work for one function. And overload resolution involves type coercion and parameter matching from what was given to what is required.

All these classes are provided to match the language features. The same kind of effort goes into three places: the compiler for the language, a runtime code generation system which has a similar class hierarchy and an even more complex object hierarchy, and the type discovery system, which must match the compiler’s implementation to support every lexical construct in the language.

Access to private members is allowed if you have proper permissions. Consider it from the compiler’s point of view. If it sees the second case, the name lookup check succeeds but the accessibility check fails. Now, consider the reflection case. “member” is just a string argument to some function as far as the compiler is concerned. The meta data is already available for anyone to use if they care to. Therefore, if you can get your hands on the meta data and have appropriate permissions, you can access private implementation specific parts too.

I encountered all these design problems because I am writing a reflection framework for C++. The language natively supports one joke and a workaround. The joke is called the typeid( ) operator and the workaround is the dynamic_cast< >( ) operator.

Considering the modern needs of runtime discovery of types, plug-in architectures and design patterns like IoC, we need a strong type system and an equally strong runtime support system for type discovery and for dynamically invoking members of these types. I call typeid a joke because it returns a type_info and there is no requirement that this type_info contain valid (don’t even think about things as high as useful) information for the object it was invoked on. The name( ) function may return an empty string, and if a non-empty string is returned it may not necessarily correspond to the compile time name of the type. You can do nothing else with a type_info apart from the name( ) and before( ) functions. before( ) does not order types lexicographically; the details are hidden from mere mortals (read programmers).
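To make the complaint concrete, here is a small sketch of what type_info does give you. Shape, Circle, reportedName and sameDynamicType are made-up names for illustration. Besides name( ) and before( ), type_info objects can also be compared for equality, which is about the only dependable operation; what name( ) returns is left to the implementation, so the sketch makes no assumption about it.

```cpp
#include <string>
#include <typeinfo>

struct Shape { virtual ~Shape() {} };  // polymorphic, so typeid sees the dynamic type
struct Circle : Shape {};

// Whatever text the implementation chooses to report; it may be mangled or even empty.
inline std::string reportedName(const Shape& s) {
    return typeid(s).name();
}

// Comparing two type_info objects tells you whether the dynamic types are identical.
inline bool sameDynamicType(const Shape& a, const Shape& b) {
    return typeid(a) == typeid(b);
}
```

Note that nothing here lets you enumerate members, find base classes or call anything by name, which is the gap a reflection framework has to fill.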

dynamic_cast is a trial and error game. You have a pointer to some base class and it is your burden to find out which exact derived class object it really is by repeated downcasts. If you have a reference, conditions are much worse for you, as your first cast must succeed or you get a bad_cast exception. If you have multiple virtual base classes, dynamic_cast is the only hope; static_cast is forbidden.

Boost goes a little further than that primitive state of affairs. They provide a typeof operator which allows you to infer a type at compile time. gcc also has a typeof operator which works similarly to boost’s version, i.e., compile time inference of a type from some expression, nothing better than that. And don’t get me started about VC++. It took them five years to get their partial specialization correct, and dependent name lookup is still messed up.
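The trial and error looks roughly like this. Animal, Dog, Cat, classify and isDog are hypothetical names used only for the illustration.

```cpp
#include <string>
#include <typeinfo>

struct Animal { virtual ~Animal() {} };
struct Dog : Animal {};
struct Cat : Animal {};

// Pointer form: a failed cast yields a null pointer, so you simply keep guessing.
inline std::string classify(Animal* a) {
    if (dynamic_cast<Dog*>(a)) return "Dog";
    if (dynamic_cast<Cat*>(a)) return "Cat";
    return "unknown";
}

// Reference form: there is no null reference, so a failed cast throws std::bad_cast.
inline bool isDog(Animal& a) {
    try {
        Dog& d = dynamic_cast<Dog&>(a);
        (void)d;  // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

Every new derived class means another branch in classify, which is exactly the maintenance burden that queryable type information would remove.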

For my reflection framework, I have chosen to implement access control and declaration specifiers as bit fields to save space. Compare adding a separate bool for each of public, private, protected, pointer, reference, constant, volatile, template and extern with packing them all into one int. If each flag occupies four bytes (as a Win32 BOOL does), one per property costs 36 bytes of additional storage for just 9 bits of information; even with a one byte bool that is 9 bytes, while all nine bits easily fit in a four byte integer.

Dynamically invoking functions on a type must involve overload resolution because C++ supports overloading. Arguments must be converted to the correct type because the compiler does that at compile time. In short, every aspect of function call resolution that happens at compile time must happen when invoking, where possible. Things like argument dependent lookup are possible only for base classes; there is no way to influence namespace level lookup or introduce new identifiers with using declarations. Similarly, template argument inference is a hard thing to do at compile time. The thing is Turing complete at compile time, and only one front end (sold only to big names like Microsoft) and an open source compiler implement it correctly, so I am not going to burst my brain over it.
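A minimal sketch of the packing idea follows. The enumerator names and the PackedSpecifiers and UnpackedSpecifiers structs are hypothetical; they are not the framework's actual declarations.

```cpp
#include <cstdint>

// One bit per property, mirroring the nine properties listed above (names are illustrative).
enum Specifier : std::uint32_t {
    Public    = 1u << 0,
    Private   = 1u << 1,
    Protected = 1u << 2,
    Pointer   = 1u << 3,
    Reference = 1u << 4,
    Constant  = 1u << 5,
    Volatile  = 1u << 6,
    Template  = 1u << 7,
    Extern    = 1u << 8
};

// All nine flags live in a single 32-bit word.
struct PackedSpecifiers {
    std::uint32_t bits;
    bool has(Specifier s) const { return (bits & s) != 0; }
    void set(Specifier s) { bits |= s; }
    void clear(Specifier s) { bits &= ~static_cast<std::uint32_t>(s); }
};

// The alternative: one bool per property, at least one byte each.
struct UnpackedSpecifiers {
    bool isPublic, isPrivate, isProtected, isPointer, isReference,
         isConstant, isVolatile, isTemplate, isExtern;
};

static_assert(sizeof(PackedSpecifiers) == sizeof(std::uint32_t),
              "the packed form is exactly one 32-bit word");
static_assert(sizeof(UnpackedSpecifiers) >= 9,
              "nine separate bools need at least nine bytes");
```

A const member function pointer would then be described by something like Public | Pointer | Constant in one word.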

For the type hierarchy used in reflection framework, I have ReflectionObject, Type, Function, MemberFunction, Field and Parameter classes. Type class is abstract and my model for reflection has each class define a private implementation of Type and return the static object of that private type when GetType is called. Also I require a type to implement a static StaticGetType which returns the contained object without first creating an instance.
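The GetType and StaticGetType arrangement might be sketched as below. Only the class names Type and ReflectionObject and the two function names come from the description above; Name( ), Widget and WidgetType are hypothetical, and making GetType virtual on ReflectionObject is an assumption of this sketch rather than something stated about the framework.

```cpp
#include <string>

// Abstract description of a type; only a name is modelled in this sketch.
class Type {
public:
    virtual ~Type() {}
    virtual std::string Name() const = 0;
};

// Assumed common base so that GetType can be called through any reflected object.
class ReflectionObject {
public:
    virtual ~ReflectionObject() {}
    virtual const Type& GetType() const = 0;
};

class Widget : public ReflectionObject {
    // Private implementation of Type, visible only to Widget itself.
    class WidgetType : public Type {
    public:
        std::string Name() const override { return "Widget"; }
    };

public:
    // Available without creating a Widget instance first.
    static const Type& StaticGetType() {
        static const WidgetType instance{};  // the single shared metadata object
        return instance;
    }

    // The instance version returns the same static object.
    const Type& GetType() const override { return StaticGetType(); }
};
```

Because both functions return the same static object, type identity can be checked by comparing addresses.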

Since the Type class contains complete information about a class/struct, it will be possible to access the private implementation of a class if appropriate access is requested.

And they made another prediction

Filed under: Fun — Tanveer Badar @ 10:54 AM

Yesterday, I read in a newspaper that our meteorological department made another attempt at their abominable weather forecasts, in reference to the highlighted portion near Oman.

weather

The news release said yesterday about today’s forecast (how could it not be forecast if it were written yesterday for today?): “We are to expect light rain and clouds in the lower parts of Sindh because of that storm”.

What really happens with their forecasts is quite the reverse. We were “To expect light rains and clouds”. Read it as: tomorrow will be the hottest day of summer so far. Don’t expect a single speck of cloud to hover. There might be some heavy rain and snow in the northern parts of Pakistan and AJK.

5 June 2007

On a wonderful day, 23 years ago

Filed under: Personal — Tanveer Badar @ 12:01 PM

Your clocks ticked 725760000 times. 12096000 minutes elapsed. It has been 201600 hours, though wikipedia says it has been 201045.6 hours. 8400 days have passed since then. Also, it is 1200 in terms of weeks. The exact number of months is 276. It was 23 years ago.

Turning to the lunar calendar, a different story awaits. Though the second, minute, day and week counts remain the same, everything else has changed. The months become 285 and the years are 23.75.

These years have seen two of these.

photo sphere  aurora borealis (solar storm of 2003)

Can you guess the hints?

I am talking about me being born. Born to a family of five children whom I came to question later: “Mirror, mirror on the wall. Who is the fairest (read smartest) of them all?”. I have said much about myself before here and here.

I cheated with the timestamp!

2 June 2007

The horrors within!

Filed under: Computer Theory — Tanveer Badar @ 3:00 PM

Have you ever given thought to the C# type system? (For all purposes, I will say only C# whenever I mean C#/VB.Net/C++-CLI/whatever languages.) Ever opened a bank account which requires a reference from someone who has previously opened an account at the same bank? Been admitted to a school which requires that the parents of previous students know you? Got membership of a club where someone else had to recommend you? And don’t forget politics. Do you know what Russell’s paradox is? If every set is a subset of the universal set, then of which set is the universal set a subset?

Each of these examples has a main theme, the idea of a tree structured hierarchy. You must start at the root to enter the system. There must be a path from you to the root, or a relation to the root, in order to be part of the system. The root is special. It has no parent and without some external help, it would not even be a part of that system. The system is inconsistent specifically for the root. You need bootstrapping to make the root part of the system. Given the rules which apply for something to be part of the system, the root will always violate these rules. The invariants hold for every other thing that is part of the system; the root is the sole entity for which they do not hold. Let me explain all these points with reference to the questions I asked.

  1. C# type system: C# requires that each type have a parent. If you do not specify a parent, it will implicitly inherit from Object if you define a class or from ValueType if you define a struct. ValueType in turn inherits from Object. Therefore, whatever the construct, you always inherit from Object. Java and many other languages suffer from the same problem. To my knowledge, C++ is the only language to avoid this problem by not requiring every class to inherit from one single class. Now, consider my claims.

    • You must start at the root to enter the system. There must be a path from you to the root or a relation to the root in order to be part of the system.
      As said earlier, no matter how you define your type, you always inherit from object.
    • The root is special. It has no parent and without some external help, it would not even be a part of that system.
      object is special. It is the base class for every other class in C# but has no parent itself. Take a look at the disassembly we can peek at with reflector.

      [Serializable, ComVisible(true), ClassInterface(ClassInterfaceType.AutoDual)]
      public class Object
      {
          // Methods
          [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
          public Object();
          public virtual bool Equals(object obj);
          public static bool Equals(object objA, object objB);
          private void FieldGetter(string typeName, string fieldName, ref object val);
          private void FieldSetter(string typeName, string fieldName, object val);
          [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
          protected override void Finalize();
          private FieldInfo GetFieldInfo(string typeName, string fieldName);
          public virtual int GetHashCode();
          [MethodImpl(MethodImplOptions.InternalCall)]
          public extern Type GetType();
          [MethodImpl(MethodImplOptions.InternalCall)]
          internal static extern bool InternalEquals(object objA, object objB);
          [MethodImpl(MethodImplOptions.InternalCall)]
          internal static extern int InternalGetHashCode(object obj);
          [MethodImpl(MethodImplOptions.InternalCall)]
          protected extern object MemberwiseClone();
          [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
          public static bool ReferenceEquals(object objA, object objB);
          public virtual string ToString();
      }

      As the declaration shows, object has no base class. Even the IL does not have an .extends directive.

    • The system is inconsistent specifically for the root.
      If we try to compile the above definition (suitably completed, and with internal dependencies removed for the sake of demonstration), the C# compiler will derive it from the object defined in mscorlib.
    • You need bootstrapping to make the root part of the system. Given the rules which apply for something to be part of the system, the root will always violate these rules.
      Since object cannot be defined in any HLL, we need something external to the targeting languages to define object. If we insist on defining it in C#, it will no longer be the object we intended it to be.

    • The invariants which hold for every other thing that is part of the system, root is the sole entity for which they do not hold.
      There is one invariant no type can violate: it must have a single parent. object has no parent.

    I am going to clump all the other examples together as they are not related to programming. However, banking deserves special attention.

  2. Open a bank account/Get admitted to a school/Club membership/Politics
    • You must start at the root to enter the system. There must be a path from you to the root or a relation to the root in order to be part of the system.
      You must know someone who is already a member. Someone must invite you into the system; you cannot enter on your own.
    • The root is special. It has no parent and without some external help, it would not even be a part of that system.

      This is easiest to explain in terms of a bank account. Consider, if everyone needed a reference to someone who already had an account, how could the first person open his/her account in the first place? Something external to the system that transcends the requirement that you must know a person beforehand who is part of the system must have played a factor.

    • The system is inconsistent specifically for the root.
      If you demand relationship, root will fail the test and the whole system collapses.
    • You need boot strapping to make root part of the system. Given the rules which apply for something to be part of the system, root will always violate these rules.

      Root does not know anyone who previously had an account in a bank, he/she should not be allowed to open an account, but the system requires at least one member who will introduce others to the system.

    • The invariants which hold for every other thing that is part of the system, root is the sole entity for which they do not hold.
      System must not require root to fulfill relationship to some parent in order to bootstrap in the first place. Every object must have a parent, root is excluded. Period.

All these are examples of the chicken and egg problem. Joel Spolsky has a very nice post somewhere (I found the link, it is here) buried in his valuable archives which discusses the same problem from a marketing point of view. He talks about things like how a new application requires a large customer base to be economical and to get sufficient customer feedback to achieve high quality, but most customers don’t have huge buying power and most won’t waste money on something that came out of a company just yesterday.

Russell’s paradox asks for a set which is even bigger than the universal set. The ‘one who is all’. That set must contain everything yet must be contained in a set which is a proper superset.

Consider another example that just jumped out in front of me. A class of languages which can be recognized by a particular machine can always be defined in terms of a sort of meta language. When that describing language is fed to this machine, it fails to recognize it. All regular languages can be defined in terms of a language which will not be regular itself. Next, PDAs can be defined in a language which PDAs will never recognize but TM/PM/nPDAs will. Turing machines will never recognize the language which describes them. In general, when we define a meta language to encompass all the languages a particular machine can recognize, that language will transcend that machine. This can be proven by Cantor’s diagonal argument.

[Edit: 11/10/2009]

As Eric Lippert points out, not every type in C# inherits from object. Interfaces and pointers do not.

Secondly, ponder this in mscorlib:

throw new ExecutionEngineException("mscorlib.resources couldn't be found! Large parts of the BCL won't work!");

This message can never be localized, try as you may.

Third: Thank Microsoft for investing in minwin (aka kernelbase.dll) in Windows 7. Otherwise, ntoskrnl.exe and hal.dll both claim to be the root of the dependency hierarchy in Win32 processes.

Run applications during Windows Vista (onwards) setup

Filed under: Fun, Tips, Windows Vista — Tanveer Badar @ 1:52 AM

It’s been tip season and I have been longing to tell you this piece of information for ages!

You can run lightweight applications during Windows Vista setup, and even things like explorer after the second reboot.

This magic works because Vista’s setup runs in a mini Windows installation called Windows PE (Windows Preinstallation Environment). This is a full blown Windows environment that exposes just enough facilities to give you a 32/64-bit GDI based GUI. I tried to take a picture to show you how to do it, but save/save as in Paint didn’t work. Due to a COM registration issue in that part of setup, shdocvw.dll was not registered at that point, explorer gave an error (refused to launch) and there was no way to save the snapshot. However, after the second reboot, I was able to run things like cmd.exe, taskmgr.exe, explorer.exe and devmgmt.msc, and GOD, devmgmt.msc showed devices installing in real time.

The trick to doing stuff like this starts with getting hold of a command prompt, which can be opened by pressing SHIFT+F10. Now, this step is documented and recommended only if you ‘really‘ need to have access to a command prompt. I went ahead and tried to run things from that command prompt and most worked. Next, you can run an application of your choice from the command line. Remember that the application should not rely on things like COM registration or OLE automation. These things are not initialized properly during that stage of setup.

1 June 2007

An excellent article on dependency properties

Filed under: Tips, WPF — Tanveer Badar @ 5:44 PM

Right from the bowels of the WPF SDK team. A superb two part post on dependency properties.

Wha’ happened? Property-changed detection mechanisms in WPF (Part One)

Wha’ Happened Part Two: More Property Changes in WPF

[Edit: A reader rightfully pointed out the mistake I had made thinking these articles were about animations, however, they are about dependency properties and only indirectly about animations because dependency properties can be animated.]
