Bug Vanquisher

19 October 2007

Major Changes in CPPCodeProvider

Filed under: C++ — Tanveer Badar @ 11:40 AM

So far, as you may have noticed, CPPCodeProvider did not have a concept of separate declaration/definition for the various objects it supported. Apart from that, it was not possible to render UserDefinedType, mainly because I was being lazy all this time.

Of course, if you are wondering what the hell CPPCodeProvider is, here are some links which explain it in much more detail: part 1, part 2, part 3, part 4, part 5 and part 6.

One day, I asked myself how to render a udt (user defined type). There are all sorts of issues with a udt. There is an ordering among the members of a type. All enumerations, unions, member types and type definitions must be (at least) declared before their first use, otherwise the compiler whines. But they cannot be defined at that point because of the dependencies they themselves have upon each other. In addition, unnamed enumerations and unions must be defined completely at that point because they have no name associated with them and a declaration cannot be rendered.

This means rendering has to be done in two phases to be safe. First, all types are declared and then functions, operators, constructors and destructors are declared if they are not inline functions. After that, these declared items are defined out of band only if they are not inline. Member types have a similar requirement: unless they specify that they must be defined inline, their definition is always moved out of class.

Member variables are defined and declared inline unless they are static but not constant and do not have integral type.

A member type that inherits from containing type can never be defined inline, so there is a check to override inline specification for member types if one of their base types coincides with the container.
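To make the layout rules concrete, here is a hand-written sketch of the kind of C++ a two-phase renderer would have to produce. It is not actual CPPCodeProvider output; the names Container, Node, Limit and ratio are made up for illustration.

```cpp
// Phase 1: the class body declares everything before its first use.
struct Container
{
    struct Node;                    // member type: declared only
    enum { Capacity = 4 };          // unnamed enumeration: must be defined in place
    static const int Limit = 8;     // static, constant, integral: initialized inline
    static double ratio;            // static, not a constant integral: declared only
    Node* head;                     // legal because Node is already declared
    int capacity( ) const;          // non-inline function: declared only
};

// Phase 2: out-of-class definitions.
// Node inherits from its container, so it can only be defined
// once Container is a complete type, never inline.
struct Container::Node : Container
{
    int value;
};

double Container::ratio = 0.5;

int Container::capacity( ) const
{
    return Capacity;
}
```

Trying to define Node inside Container's body fails to compile because the base class is still incomplete at that point, which is why the inline request has to be overridden.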

This covers one major change. The other one is the separation of declarations from definitions. For example, write in CodeObject is overloaded and the two declarations are:

void write( std::wostream& os , unsigned long tabs ) const;
void write( std::wostream& declos , std::wostream& defos , unsigned long decltabs , unsigned long deftabs ) const;

As always, these functions call their respective virtual counterpart overloads writetext. write with two arguments always renders anything inline while the other function separates declaration from definition.

This means, it is now possible to have a header containing all the function declarations and class definitions and have their definitions/implementations in a separate file. However, if a class/function is templated you will not be able to override correct behavior with either function and they will be defined instead of declared. Second overload of writetext will refer to the first one in that case passing only declos/decltabs for correct rendering.
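A minimal sketch of how the two overloads could cooperate is given below. FunctionSketch is a hypothetical stand-in, not the real CodeObject hierarchy; only the write/writetext signatures are taken from the declarations above, and the rendering itself is an assumption.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a CodeObject-derived function.
class FunctionSketch
{
public:
    FunctionSketch( const std::wstring& name , const std::wstring& body , bool is_template = false )
        : name_( name ) , body_( body ) , is_template_( is_template ) { }
    virtual ~FunctionSketch( ) { }

    // Everything inline, on a single stream.
    void write( std::wostream& os , unsigned long tabs ) const
    {
        writetext( os , tabs );
    }

    // Declaration goes to declos, definition goes to defos.
    void write( std::wostream& declos , std::wostream& defos ,
                unsigned long decltabs , unsigned long deftabs ) const
    {
        writetext( declos , defos , decltabs , deftabs );
    }

protected:
    virtual void writetext( std::wostream& os , unsigned long tabs ) const
    {
        const std::wstring indent( tabs , L'\t' );
        if( is_template_ )
            os << indent << L"template< typename T >\n";
        os << indent << L"void " << name_ << L"( )\n"
           << indent << L"{\n"
           << indent << L'\t' << body_ << L'\n'
           << indent << L"}\n";
    }

    virtual void writetext( std::wostream& declos , std::wostream& defos ,
                            unsigned long decltabs , unsigned long deftabs ) const
    {
        if( is_template_ )
        {
            // Templates are always defined where they are declared,
            // so fall back to the single-stream overload.
            writetext( declos , decltabs );
            return;
        }
        declos << std::wstring( decltabs , L'\t' ) << L"void " << name_ << L"( );\n";
        writetext( defos , deftabs );
    }

private:
    std::wstring name_;
    std::wstring body_;
    bool is_template_;
};
```

Passing a header stream as declos and a source-file stream as defos gives the header/implementation split; for a templated function the second stream simply stays empty.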

One major feature I still miss is to properly support partial and complete specialization. Even during writetext implementation for udts I faced that challenge of how the user will be able to pass template parameter information to base classes and member variable declarations. Expect to see these issues fixed in a future update as someone does have to wait.

9 June 2007

Runtime Code Generation in C++ 6

Filed under: C++ — Tanveer Badar @ 3:38 PM

As I said last time, no discussion of a language is complete without statements and expressions. We talked about expressions in CppCodeProvider last time. Today’s topic is statements. Everything in C++ is a statement. Declare a function or a variable, it is a declaration statement; define a class, it is another declaration statement; define a function, another declaration statement; write a loop, iteration statement; use if-else or switch, conditional statement; perform any type of jumping by goto, return, break or continue, all are jump-statements; declare a variable in a for-loop, a for-loop init statement; label something or write case or default inside a switch, labeled statement; surround a piece of code with braces, compound statement; use a semi-colon after an expression, expression-statement!

In short everything is a statement in C++. There is simply no escaping this fact of life. My life is complicated by the compound statement thing. Since anything can be a statement and come after anything in either braces or otherwise, every class to represent a type of statement must derive from same base class, Statement.

First there is an ExpressionStatement, which takes an Expression as an argument and turns it into a Statement. There is a UsingStatement, which declares an identifier to be visible in another scope. We are used to jumping all over the place, all our lives. JumpStatement captures the essence of jumping. It takes either a string to generate a goto, a member of JumpType except goto, or an optional expression to generate a return. It is illegal to specify Goto as an argument to second constructor because a goto requires a label to jump to. The expression passed to the third constructor must not be null because for a simple return with no value you should pass JumpType::Return to the second constructor. To accompany the goto of JumpStatement, we have a LabelStatement which takes a string and another statement and applies that string as a label to that statement.
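The three-constructor rule for JumpStatement can be sketched roughly as follows. This is an illustration under assumptions: the Expression struct, the text( ) accessor and the use of std::invalid_argument are invented here and are not the library's actual interface.

```cpp
#include <stdexcept>
#include <string>

enum class JumpType { Goto , Return , Break , Continue };

// Hypothetical minimal expression: just the text it renders to.
struct Expression { std::string text; };

class JumpStatement
{
public:
    // First constructor: goto <label>;
    explicit JumpStatement( const std::string& label )
        : text_( "goto " + label + ";" ) { }

    // Second constructor: break, continue or a bare return.
    explicit JumpStatement( JumpType type )
    {
        switch( type )
        {
        case JumpType::Return:   text_ = "return;";   break;
        case JumpType::Break:    text_ = "break;";    break;
        case JumpType::Continue: text_ = "continue;"; break;
        case JumpType::Goto:
            // A goto without a label cannot be rendered.
            throw std::invalid_argument( "goto requires a label" );
        }
    }

    // Third constructor: return <expression>;
    explicit JumpStatement( const Expression* value )
    {
        if( value == nullptr )
            throw std::invalid_argument( "use JumpType::Return for a bare return" );
        text_ = "return " + value->text + ";";
    }

    const std::string& text( ) const { return text_; }

private:
    std::string text_;
};
```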

We have discussed all the single statement constructs for statements. It is time to move on to statement blocks. It is legal to have a single statement block, but it will be surrounded by { } and introduces a new lexical scope. For compound statement, you use StatementBlock. StatementBlock can be instantiated with a collection of statements which will be part of it or default initialized. All statements which are part of one StatementBlock can be accessed through Statements( ). This class is the base class for almost every other kind of statement.
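The common base class and the block that nests other statements could look something like the sketch below. It is a simplification under assumptions: the expression is held as plain text instead of an Expression object, rendering goes through a hypothetical write( ) on std::ostream, and ownership uses std::shared_ptr for brevity.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct Statement
{
    virtual ~Statement( ) { }
    virtual void write( std::ostream& os , unsigned tabs ) const = 0;
};

// Turns expression text into a statement by appending a semi-colon.
struct ExpressionStatement : Statement
{
    explicit ExpressionStatement( const std::string& expr ) : expr_( expr ) { }

    void write( std::ostream& os , unsigned tabs ) const override
    {
        os << std::string( tabs , '\t' ) << expr_ << ";\n";
    }

    std::string expr_;
};

// A compound statement: braces around any number of statements,
// including other blocks, because everything derives from Statement.
struct StatementBlock : Statement
{
    typedef std::vector< std::shared_ptr< Statement > > Collection;

    Collection& Statements( ) { return statements_; }

    void write( std::ostream& os , unsigned tabs ) const override
    {
        const std::string indent( tabs , '\t' );
        os << indent << "{\n";
        for( const auto& s : statements_ )
            s->write( os , tabs + 1 );
        os << indent << "}\n";
    }

    Collection statements_;
};
```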

A catch block is essentially a compound statement with one optional exception variable declared and initialized before the scope is entered by execution. CatchClause represents this structure in the code graph. It optionally takes a VariableDeclaration, or is default initialized if this catch block is a catch-all type. The exception variable is accessible through CatchType and, since it inherits from StatementBlock, all statements are also accessible.

We use CatchClause to build a higher level construct, namely the try-catch block. A try-catch block has a body of statements surrounded by try and a list of catch clauses which catch exceptions thrown by the try block. TryCatchClause is always default initialized and you add statements to the try block with Statements( ) and catch blocks through CatchClauses( ). Both of these functions return an appropriate collection where you can call push_back to add new items.

A case statement also inherits from StatementBlock. It takes an optional bool parameter to indicate if case block should begin in a new lexical scope because it contains variable declarations. If you provide a PrimitiveExpression to Instantiate it becomes the label for case, otherwise, a default is generated. Although it is illegal to have more than one default or same case in a switch statement, these checks are more high level than any code dom. Therefore, it is the responsibility of client to ensure it does not happen, otherwise, deal with it when the compiler whines on the source code :). Also, there is no check to see if the PrimitiveExpression is really a compile time integral constant or not, you will have to face your compiler if you do contrariwise.

Similar to CatchClause, we use Case to build a higher level construct, SwitchStatement. A switch is essentially an expression which is evaluated at runtime and a bunch of case blocks crammed together in one lexical scope. You construct a SwitchStatement by providing an Expression and adding case blocks through Cases( ). An alternative to switch is the if-else clause. Sometimes there is no option other than if, e.g., if you want to test for a range, want to check a floating-point value, or the value is not a compile time constant. The if-else clause is represented by ConditionClause. You provide it with only the expression which is checked in if( ). Statements which will be part of the if block are accessed through Statements( ); those which will be part of the else block are accessed as a StatementBlock with Else( ). The condition can be retrieved with Condition( ). Special treatment for the else block is necessary only because it is not possible to inherit from StatementBlock twice directly. Therefore, the else block is exposed as a StatementBlock object and statements in the if block are inherited through Statements( ).

All loops are clumped in the base class of IterationClause which inherits from StatementBlock. Since all statements of the loop body are already accessible through Statements( ), IterationClause only introduces a Condition( ) to access the loop condition. The class is abstract as there is no way to realize a loop without telling whether it is a for, while or do-while loop. A ForLoop specializes IterationClause with two additional expressions for the ‘for-initialization statement’ and the increment expression. If none of these three expressions is given, an infinite for loop is generated. WhileLoop and DoWhileLoop specialize IterationClause to generate different forms of loop with a body and condition only.

Runtime Code Generation in C++ 5

Filed under: C++ — Tanveer Badar @ 3:34 PM

Sorry for such a long discontinuity in posts related to CppCodeProvider. I have been really busy and barely had time to do anything apart from my project. Till last time, we have discussed type support in CppCodeProvider. Also, I am sorry about the bout of C++ I am having recently. It really annoys me to stick to one topic too but it had to be finished sooner or later.

You cannot use a type without statements and expressions. Apart from types, a language is entirely built from expressions and statements. C++ has a rich support for expressions, which are the topic of this post.

Since it is possible to mix any type of expression with any other from compiler’s point of view, all expressions derive from one base class Expression. Let’s begin from the simplest of expressions, a literal expression. A literal is represented by PrimitiveExpression which is built from a string. Next, consider integral compile time constants. They are encapsulated by IntegralExpression. This expression is currently only used in case labels, although it is possible to use it in non-type template parameters too.

When adding namespaces and classes to the mix, we need some way to resolve an identifier to its scope; for this we need ScopeResolutionExpression. A ScopeResolutionExpression can take one argument, for global identifiers, or two arguments: a string and another ScopeResolutionExpression to refer to one particular scope. Then, consider how to reference a variable declaration in code, and for similar purposes an argument. For these, ArgumentReference and VariableReference are provided which respectively allow you to refer to an argument or variable defined previously. We can also refer to a function for function pointers, so there is a MethodReference too. A MethodReference can refer to any function. Therefore, there are smarts built into writetext to handle the special cases of free standing and member functions.

In order to allocate objects on heap, you can use NewExpression and its corresponding DeleteExpression. A NewExpression can be an array new or a scalar new. It can allocate a multi-dimensional array variable. In case of scalar new, it calls the constructor with the given number of arguments. A DeleteExpression can be a scalar delete or an array delete, with the difference given by member function Array.

We also need casts to change from one type to another. Cast expression takes another expression which will be cast to another type, the target type and type of cast to apply to given expression. Possible types of casts are c-style casts, reinterpret_cast, dynamic_cast, static_cast and const_cast.

No discussion of expressions is complete without expressions with operators. A BinaryExpression takes two other expressions and an operator Type to apply infix to the given expressions. All types of binary operators can be given. Similarly, UnaryExpression allows us to apply a unary operator to some variable. There are two types of unary expressions. PrefixExpression applies any of the prefix operators to the given expression and PostfixExpression applies any of the postfix operators to the given expression. It is possible to provide names of some operators instead of their types as defined in the standard.

We can change precedence of operators by applying parentheses with the help of ParenthesizedExpression to an expression and producing a new expression. This class takes an Expression as argument and surrounds it with parentheses. Another handy expression is the conditional operator. This operator is provided in the class ConditionalExpression. This class takes three expressions and uses the first one to decide which of the last two expressions to present as the result.
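As a rough illustration of how such an expression graph composes, here is a cut-down sketch. It assumes a text( ) method that returns the rendered source and takes the operator as a string rather than an operator Type enumeration; neither is claimed to match the real classes.

```cpp
#include <memory>
#include <string>

struct Expression
{
    virtual ~Expression( ) { }
    virtual std::string text( ) const = 0;
};
typedef std::shared_ptr< Expression > ExprPtr;

struct PrimitiveExpression : Expression
{
    explicit PrimitiveExpression( const std::string& literal ) : literal_( literal ) { }
    std::string text( ) const override { return literal_; }
    std::string literal_;
};

struct BinaryExpression : Expression
{
    BinaryExpression( ExprPtr lhs , const std::string& op , ExprPtr rhs )
        : lhs_( lhs ) , op_( op ) , rhs_( rhs ) { }
    std::string text( ) const override
    {
        return lhs_->text( ) + " " + op_ + " " + rhs_->text( );
    }
    ExprPtr lhs_;
    std::string op_;
    ExprPtr rhs_;
};

// Wraps any expression in parentheses to override precedence.
struct ParenthesizedExpression : Expression
{
    explicit ParenthesizedExpression( ExprPtr inner ) : inner_( inner ) { }
    std::string text( ) const override { return "( " + inner_->text( ) + " )"; }
    ExprPtr inner_;
};

// condition ? first : second
struct ConditionalExpression : Expression
{
    ConditionalExpression( ExprPtr condition , ExprPtr first , ExprPtr second )
        : condition_( condition ) , first_( first ) , second_( second ) { }
    std::string text( ) const override
    {
        return condition_->text( ) + " ? " + first_->text( ) + " : " + second_->text( );
    }
    ExprPtr condition_;
    ExprPtr first_;
    ExprPtr second_;
};
```

Without the ParenthesizedExpression node, a sum nested under a product would render as a + b * c and silently change meaning, which is why the wrapper exists as its own node.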

To cover exceptions, a ThrowExpression is also provided. Since we can throw an object or rethrow a previous exception, it is possible to instantiate ThrowExpression without any argument to realize rethrowing a previous exception.

A function call qualifies as an expression. A MethodInvoke takes a reference to either a Function or a MemberFunction. Second argument is an Expression which will be used as the object reference for the MemberFunction call or in place of function name for Function. For the MemberFunction, it is possible to specify whether the object is a reference or a pointer. Argument list is accessible through Arguments( ).

Next time, it will be statements.

8 June 2007

A comparison of CPlusPlusCodeProvider to System.CodeDom

Filed under: C++ — Tanveer Badar @ 6:26 PM

I thought these were deficiencies in CppCodeProvider. Namely,

· It is very difficult to partially specialize a user-defined type.

· Union does not write anything at this point.

· It is not possible to initialize enumerators with values.

· A user defined type does not write anything at this point.

· Pointers to function, member function and member variables are very hard to work with.

· Friends cannot be declared at this point.

· Forward declarations are not possible.

But it turns out some of these are features which are really difficult to represent in any code graph for C++. CodeDom has much more serious limitations, addressed here. A short listing follows.

  • No unary operators like -, +, ++, -- etc.
  • No readonly for fields in a class/struct.
  • No break/continue in a loop/switch statement.
  • No type cast support in the form of as/is.
  • No while loop.
  • No switch statement.
  • Attributes on set/get of a property.
  • No default indexer in VB.Net.
  • Main always has void return type in C#.
  • No == operator.
  • No internal virtual functions.
  • No way to apply [field:] attributes to an event.

Reflection on Reflection: Function.Invoke/MemberFunction.Invoke

Filed under: C++ — Tanveer Badar @ 4:44 PM

Consider a language with no namespaces, no generic coding paradigms, just class hierarchy support. And for that matter consider no multiple inheritance to simplify things. To complicate things, functions can be overloaded but runtime polymorphism is forbidden. Now, let your thoughts run free to the idea that all type information the compiler found at compile time is also available at runtime through a special function GetType( ) on every class. This returns a Type object which you can query for all functions and call Invoke for a particular method on that object.

With this background, let’s converse a bit about calling the functions. All you have available is the name of the function to call and the meta data about the type. Invoke must perform these steps, in order.

  1. Validate its arguments.
  2. If it is called on MemberFunction, search through the inheritance hierarchy for the function. Stop this procedure as soon as at least one matching function is found, otherwise, search only in the global scope.
  3. If no function was found in the last step, throw a notfoundexception.
  4. For all the functions in the found set, check their number of parameters and reject all those that take fewer parameters than the number of arguments given.
  5. If no functions are left, throw an argumentmismatchexception.
  6. For the remaining functions, reject all candidates which take more parameters and do not have default values for extra parameters.
  7. If no functions are left, throw an argumentmismatchexception. By this time, we only have those functions left whose number of arguments are same or more with default values.

  8. Perform overload resolution among the candidates.
    1. Consider one function at a time. Perform positional type coercion for each parameter. Prefer implicit conversions and promotions over user defined conversions. Also, user defined conversions must be discovered through meta data which will incur a performance hit every time. Record the number of conversions required. Also perform coercion on the return type. If the return type cannot be converted, reject that function.

    2. Those functions which require the minimum number of conversions are the best candidates.
  9. If no functions are left, throw an argumentmismatchexception.
  10. If we are left with more than one function, throw an ambiguousmatchexception.
  11. If the function can not be accessed by the caller, throw an uncallableexception.
  12. Perform type coercion on parameters for the candidate function.
  13. Perform type coercion on return value of the candidate function and return it to caller.
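Steps 3 to 10 above can be sketched in a few dozen lines. Everything here is an assumption made for illustration: types are plain strings, the only implicit conversion is int to double, and the three exception classes stand in for the exceptions named in the list. Name lookup, access checks and the actual call are left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct NotFound : std::runtime_error
{ NotFound( ) : std::runtime_error( "no function with that name" ) { } };
struct ArgumentMismatch : std::runtime_error
{ ArgumentMismatch( ) : std::runtime_error( "no viable overload" ) { } };
struct AmbiguousMatch : std::runtime_error
{ AmbiguousMatch( ) : std::runtime_error( "more than one best overload" ) { } };

struct Candidate
{
    std::string name;
    std::vector< std::string > params;  // parameter type names, in order
    std::size_t defaults;               // trailing parameters with default values
};

// Toy coercion rule: 0 for an exact match, 1 for int -> double, -1 if impossible.
int conversion_cost( const std::string& from , const std::string& to )
{
    if( from == to ) return 0;
    if( from == "int" && to == "double" ) return 1;
    return -1;
}

Candidate resolve( const std::vector< Candidate >& found ,
                   const std::vector< std::string >& args )
{
    if( found.empty( ) ) throw NotFound( );                            // step 3

    const Candidate* best = nullptr;
    int best_cost = 0;
    bool ambiguous = false;

    for( const Candidate& c : found )
    {
        if( c.params.size( ) < args.size( ) ) continue;                // step 4
        if( c.params.size( ) - args.size( ) > c.defaults ) continue;   // step 6

        int cost = 0;                                                  // step 8.1
        bool viable = true;
        for( std::size_t i = 0 ; i < args.size( ) && viable ; ++i )
        {
            const int step = conversion_cost( args[ i ] , c.params[ i ] );
            if( step < 0 ) viable = false; else cost += step;
        }
        if( !viable ) continue;

        if( best == nullptr || cost < best_cost )                      // step 8.2
        {
            best = &c; best_cost = cost; ambiguous = false;
        }
        else if( cost == best_cost )
            ambiguous = true;
    }

    if( best == nullptr ) throw ArgumentMismatch( );                   // steps 5, 7, 9
    if( ambiguous ) throw AmbiguousMatch( );                           // step 10
    return *best;
}
```

Counting conversions is a much cruder ranking than the one a real C++ compiler applies, but it reproduces the familiar case where f( int ) and f( int , int = 0 ) are ambiguous for a single int argument.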

If we allowed more generality than what was excluded, the conditions would have been much worse.

  1. With namespaces, there is no single global scope. You must have a Type object for each namespace. Overloads can be found at any level of the namespace hierarchy too.

  2. With generic code, there is no way to instantiate a new class each time you specify parameters for which a concrete instantiation does not exist. Therefore, an additional check must be done to see that the arguments match the type of the object they are called on. Also, those arguments which participate in generic type deduction cannot have type coercion applied to them. A generic type argument inferable through multiple parameters must be the same for every individual inference.

  3. You want multiple inheritance, only more work for Invoke. It has to search all branches of base classes simultaneously. If the function was declared in multiple base classes, it should throw an ambiguous match exception. Therefore, it must construct a tree rooted at the type of calling object and traverse it depth first to find the candidates and perform overload resolution among them.

  4. If someone thought virtual functions are a must, then, how are you going to invoke the most derived implementation when all you have is only a pointer to the function implementation at most till this level of inheritance. They must be invoked through the exact reference to the calling object and no other way.

6 June 2007

Reflecting on Reflection

Filed under: C++ — Tanveer Badar @ 5:50 PM

Don’t tell me you never did things like obj.GetType( ) or typeof( obj ). Everyone does, admit it. But have you ever thought what goes behind the scenes of all this raw power? If you had to implement such a system yourself what design decisions would you make?

Now that you have admitted about typeof( obj ), tell me if you ever wondered which moron wrote this enumeration?

[Flags] 
System.Reflection.BindingFlags 
{ 
    CreateInstance, 
    DeclaredOnly, 
    Default, 
    ExactBinding, 
    FlattenHierarchy, 
    GetField, 
    GetProperty, 
    IgnoreCase, 
    IgnoreReturn, 
    Instance, 
    InvokeMethod, 
    NonPublic, 
    OptionalParamBinding, 
    Public, 
    PutDispProperty, 
    PutRefDispProperty, 
    SetField, 
    SetProperty, 
    Static, 
    SuppressChangeType 
}

Why are all these access control flags mixed up with things like overload resolution and instance/static methods? Where does all the meta data about a type go? (Well in the meta data dictionary! Where else would it go?) Does MethodInfo.Invoke perform overload resolution? How are arguments coerced if the types don’t match exactly? Why do we seem to have a separate class for almost every lexical scoping construct? Why can you do this if you have proper access

class a 
{ 
    int member; 
} 

typeof( a ).GetField( "member" , BindingFlags.NonPublic | BindingFlags.Instance ).GetValue( new a( ) );

but not this

class a 
{ 
    int member; 
} 

a obj = new a( ); 

Console.WriteLine( obj.member );

Enough questions, let’s discover the reasoning behind them in the CLR. First, why be a moron? That moron made a really good decision: all these things would have required a separate slot in the class, so just clump them together in a bit field to save considerable space. The meta data goes in a meta data dictionary in your assembly. Reflection APIs read it from there. Type size is essentially reduced.

And did you know that in IL you refer to an entity by its ordinal in some table. Want to call a method? Emit a call instruction on the object/class with the argument equal to its slot. Instantiating some class? Emit a call to newobj with the ordinal of that class in the type table.

Method.Invoke is a big machine. It has to search all the methods which match the name depending on being case sensitive or not. Then, it must find the best method from that set using overload resolution rules which the compiler uses at compile time. Quite a bit of work for one function. And overload resolution involves type coercion and parameter matching from what was given to what is required.

All these classes are provided to match the language features. The same kind of effort goes into three places: the compiler for the language, a runtime code generation system which has a similar class hierarchy and an even more complex object hierarchy, and the type discovery system which must match the compiler’s implementation to support every lexical construct in the language.

Access to private members is allowed if you have proper permissions. Consider it from the compiler’s point of view. If it sees the second case, the name lookup check succeeds but the accessibility check fails. Now, consider the reflection case. “member” is just a string argument to some function for the compiler. The meta data is already available for anyone to use if they care to. Therefore, if you can get your hands on the meta data and have appropriate permissions, you can access private implementation specific parts too.

I encountered all design problems because I am writing a reflection framework for C++. The language natively supports one joke and a work around. The joke is called typeid( ) operator and the work around is dynamic_cast< >( ) operator.

Considering the modern needs of runtime discovery of types, plug-in architectures and design patterns like IoC, we need a strong type system and an equally strong runtime support system for type discovery and dynamically invoking members of these types. I call typeid a joke because it returns type_info and there is no requirement that this type_info contains valid (don’t even think about things as high as useful) information for the object it was invoked on. The name( ) function may return an empty string, and if a non-empty string is returned it may not necessarily correspond to the compile time name of the type. You can do nothing else with a type_info apart from comparing it for equality and calling the name( ) and before( ) functions. before( ) does not order types lexicographically, the details are hidden from mere mortals (read programmers).

dynamic_cast is a trial and error game. You have a pointer to some base class and it is your burden to find out which exact derived class object it really is by repeated downcasts. If you have a reference, conditions are much worse for you as your first cast must succeed, otherwise you get a bad_cast exception. If you have multiple virtual base classes, dynamic_cast is the only hope, static_cast is forbidden.
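The trial and error game looks like this in practice. Shape, Circle and Square are made-up types for the example; the casts and std::bad_cast are standard C++.

```cpp
#include <string>
#include <typeinfo>

struct Shape { virtual ~Shape( ) { } };
struct Circle : Shape { };
struct Square : Shape { };

// With a pointer, every failed guess yields a null pointer,
// so you keep guessing until something sticks.
std::string describe( const Shape* s )
{
    if( dynamic_cast< const Circle* >( s ) ) return "circle";
    if( dynamic_cast< const Square* >( s ) ) return "square";
    return "unknown shape";
}

// With a reference there is no null to hand back,
// so a wrong guess throws std::bad_cast instead.
bool is_circle( const Shape& s )
{
    try
    {
        (void) dynamic_cast< const Circle& >( s );
        return true;
    }
    catch( const std::bad_cast& )
    {
        return false;
    }
}
```

Every new derived class means another branch in describe, which is exactly the maintenance burden a real reflection facility would remove.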

Boost goes a little further than that primitive state of affairs. They provide a typeof operator which allows you to infer type at compile time. gcc also has a typeof operator which works similar to boost’s version, i.e., compile time inference of a type from some expression, nothing better than that. And don’t get me started about VC++. They are slow enough to get their partial specialization correct after five years and dependent name lookup is still messed up.

For my reflection framework, I have chosen to implement access control and declaration specifiers as bit fields to save space. Consider adding a bool for things like public, private, protected, pointer, reference, constant, volatile, template and extern or packing them all in one int. One bool for each takes at least 9 bytes, and 36 bytes if every flag ends up in its own four byte slot, for just 9 bits of information which will easily fit in a four byte integer.
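A small sketch of the packing idea follows. The flag names mirror the list in the paragraph, but the enumeration, the PackedSpecifiers helper and its member names are hypothetical and not taken from the framework.

```cpp
#include <cstdint>

// One bit per specifier (hypothetical names and values).
enum Specifier : std::uint32_t
{
    Public    = 1u << 0 ,
    Private   = 1u << 1 ,
    Protected = 1u << 2 ,
    Pointer   = 1u << 3 ,
    Reference = 1u << 4 ,
    Constant  = 1u << 5 ,
    Volatile  = 1u << 6 ,
    Template  = 1u << 7 ,
    Extern    = 1u << 8
};

// All nine flags share a single four byte integer.
struct PackedSpecifiers
{
    std::uint32_t bits = 0;

    void set( Specifier s )        { bits |= s; }
    void clear( Specifier s )      { bits &= ~static_cast< std::uint32_t >( s ); }
    bool test( Specifier s ) const { return ( bits & s ) != 0; }
};

// The alternative: one bool per flag.
struct BoolSpecifiers
{
    bool is_public , is_private , is_protected;
    bool is_pointer , is_reference , is_constant;
    bool is_volatile , is_template , is_extern;
};

static_assert( sizeof( PackedSpecifiers ) == 4 , "nine flags fit in four bytes" );
static_assert( sizeof( BoolSpecifiers ) >= 9 , "one bool per flag needs at least nine bytes" );
```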

Dynamically invoking functions on a type must have overload resolution because C++ supports overloading. Arguments must be converted to the correct type because the compiler does that at compile time. In short, every aspect of function call resolution that happens at compile time must happen for invoking where possible. Things like argument dependent lookup are possible only for base classes; there is no way to influence namespace level lookup or introduce new identifiers with using declarations. Similarly, template argument inference is a hard thing to do at compile time. The thing is Turing Complete at compile time, only one front end (sold only to big names like Microsoft) and an open source compiler implement it correctly, and I am not going to burst my brain over it.

For the type hierarchy used in reflection framework, I have ReflectionObject, Type, Function, MemberFunction, Field and Parameter classes. Type class is abstract and my model for reflection has each class define a private implementation of Type and return the static object of that private type when GetType is called. Also I require a type to implement a static StaticGetType which returns the contained object without first creating an instance.
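The GetType / StaticGetType arrangement described above might be laid out along these lines. Widget, WidgetType and Name( ) are invented for the example, and the real Type class certainly carries far more than a name.

```cpp
#include <string>

// Abstract description of a type; every reflected class supplies its own.
class Type
{
public:
    virtual ~Type( ) { }
    virtual std::string Name( ) const = 0;
};

class Widget
{
public:
    virtual ~Widget( ) { }

    // Available without creating a Widget first.
    static const Type& StaticGetType( )
    {
        static WidgetType instance;   // the single static object of the private type
        return instance;
    }

    // Available on any instance; returns the same object.
    virtual const Type& GetType( ) const { return StaticGetType( ); }

private:
    // Private implementation of Type, invisible to clients.
    class WidgetType : public Type
    {
    public:
        std::string Name( ) const override { return "Widget"; }
    };
};
```

Because WidgetType is a member of Widget, it is allowed to touch Widget's private parts, which is what makes access to private members through reflection possible in this model.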

Since the Type class contains complete information about a class/struct, it will be possible to access the private implementation of a class if appropriate access is requested.

4 May 2007

Runtime Code Generation in C++ 4

Filed under: C++ — Tanveer Badar @ 10:56 PM

Consider a world of programming without the ability to define types. Without types, OO programming crumbles to ashes, C#, Java and Visual Basic do not even exist. C++ becomes a super set of C with only exceptions and template functions as major language features. Template meta programming in C++ is no longer possible. All you can do in .Net framework is work with C or write your main function in IL and find some functional language to work with. Writing even a single line in SmallTalk is impossible as the act of declaring a class is a message to the Class object to define that class. Even in C, you cannot play with structures, enumerations and bit fields, because in essence, they are all user defined types. Functional languages rise to the surface, Google’s map-reduce is seen as the solution to every problem.

But fortunately, this was only a nightmare. We do live in a world with happy classes, structures, enumerations, unions and a decent meta programmable language. From a compiler writer’s point of view, the notion of meta programmability is enough to give him nightmares, no decency on his part. Before we start discussing the classes present in CppCodeProvider for encapsulating types, I should point out that some of them still have incomplete implementations. There are so many things left out that they merit discussion before we do anything else.

· It is very difficult to partially specialize a user-defined type.

· Union does not write anything at this point.

· It is not possible to initialize enumerators with values.

· A user defined type does not write anything at this point.

· Pointers to function, member function and member variables are very hard to work with.

· Friends cannot be declared at this point.

· Forward declarations are not possible.

With all that said, let us move on to the types contained within. Perhaps the simplest type is the Enumeration. An enumeration is a set of named constants that have integral values, either explicitly assigned or implicitly generated by the compiler. The peculiarity of C++ enumerations is that they do not define a lexical scope. Enumerators appear in the same lexical scope as the enumeration itself. It is initialized with a name (optional) and a collection of enumerators.
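The scoping peculiarity is easy to see in plain C++. The enumerations below are examples only and have nothing to do with the Enumeration class itself.

```cpp
// Enumerators land in the scope that contains the enumeration,
// so Red, Green and Blue are visible without any Color:: prefix.
enum Color { Red , Green , Blue };

// The name is optional; the enumerator is still usable.
enum { MaxColors = 3 };

// Values can be assigned explicitly or generated by the compiler:
// Tuesday becomes 2 because it follows Monday = 1.
enum Weekday { Monday = 1 , Tuesday , Sunday = 7 };

int as_int( Color c )
{
    return c;   // unscoped enumerations convert to int implicitly
}

Color next( Color c )
{
    return static_cast< Color >( ( as_int( c ) + 1 ) % MaxColors );
}
```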

Next up, it is time to represent a built-in type using the BasicType class. This class checks that the name provided is indeed just a built-in type and nothing else, otherwise its constructor throws a std::invalid_argument exception. A Union encapsulates a union in code and provides access to its member variables, functions and operators. These additional attributes are only present if a name was given to Union’s constructor because unnamed unions can only have member variables.

To represent parameters in a template declaration, TemplateParameter is used as the base class. Three different classes derive from it to capture three different ways of declaring parameters in a template. They are NonTypeParameter, to represent a built-in type; TypedParameter, to represent an arbitrary type; and TemplateTemplateParameter, to represent a template template parameter in the declaration. NonTypeParameter has one additional method IsIntegral for the fact that a non-type parameter can be an integral type or a pointer/reference to floating point types.
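For reference, these are the C++ declarations the three parameter classes have to be able to describe. The templates below are throwaway examples, not code generated by the library.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Non-type parameter of integral type.
template< std::size_t N >
struct Fixed
{
    static std::size_t size( ) { return N; }
};

// Non-type parameter that is a reference to a floating point object.
// The object must have linkage, hence the namespace scope variable.
double factor = 1.5;

template< double& R >
struct Scaled
{
    static double twice( ) { return R * 2; }
};

// Type parameter: any type at all.
template< typename T >
struct Holder
{
    T value;
};

// Template template parameter: the argument is itself a template.
template< template< typename , typename > class Container , typename T >
struct Wrapper
{
    Container< T , std::allocator< T > > items;
};
```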

The big kahuna for types is the UserDefinedType. This represents either a class or structure in code. A number of members are present: member functions; member operators; member type definitions; member enumerations; member variables; member types; constructors; and an optional destructor. Except for the destructor, all these members are exposed as homogeneous collections that can have any number of items in them. Additionally, a UserDefinedType can have any number of bases and template parameters. Base classes are specified using the BaseTypes member function which returns a collection. Similarly, template parameters are accessed through TemplateParameters which returns a TemplateParameterCollection. If the user-defined type has some template parameters, it is also possible to partially or completely specialize it on those parameters. Some utility functions let you investigate whether the type is abstract; has template parameters; is specialized if it has template parameters; has its constructors declared private, in other words is sealed.

Now, let me show you the flip side of the coin, the process of instantiating a user-defined type. If it has template parameters, you have to supply template arguments to it during instantiation. This is taken care of by instantiate and its overload which takes a StringCollection& as an argument. Also, you need the fully elaborated name of a user-defined type if you define any member function out of class. For this purpose, UserDefinedType has writeelaboratedname in its public interface.

A member user-defined type needs some additional processing before it can be written, such as: if it is being written out of class, it needs to include <containing class elaborated name>::<member type name> before its definition; if the containing class is one of the bases, the member user-defined type must be defined at the same lexical scope as the containing class; if both classes have template parameters, they cannot be merged but must be written in two parts. This means that the public interface of MemberType is the same as that of UserDefinedType, but the internals are very different. There is a lot more going on behind the scenes than what appears in front of the veil.
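The "two parts" rule for template parameters is plain C++ syntax and looks like this when written by hand. Outer and Inner are placeholder names.

```cpp
template< typename T >
struct Outer
{
    // Declared in class, defined out of class.
    template< typename U >
    struct Inner;

    T value;
};

// Out-of-class definition: the elaborated name Outer< T >::Inner is
// preceded by two template parameter lists, one for each class.
// Writing template< typename T , typename U > here is an error.
template< typename T >
template< typename U >
struct Outer< T >::Inner
{
    T first;
    U second;
};
```

A renderer therefore has to walk outwards through every enclosing class and emit one template header per level before the member type's own header.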

Finally, two more classes worth mentioning are FunctionPtr and MemFun, which represent a pointer to a function and a pointer to a member function respectively. Note that these classes are undergoing revision; their interface and representation are still not decided. There is also a class PointerToMember, but it is so incomplete that there is no point in discussing it further.

27 April 2007

Runtime Code Generation in C++ 3

Filed under: C++ — Tanveer Badar @ 9:40 PM

[Update 27/04/2007, this post has been overdue for almost two weeks now.]

As I promised to discuss Function and its descendants in this post, here are the tiniest little details about them. A Function represents a free-standing, translation unit level function. It has a name, a return type, a set of parameters, an optional exception specification, a possible function level try block, and most importantly, a set of statements. These details also apply to all of its descendants, with a few exceptions. A function can be inline and can have template parameters in its declaration.

The public interface for a function is fairly simple, and most functions are either get/set style or return a reference/pointer to some internal representation detail. The real fun resides in the protected interface of the class. It begins with the call to writetext. This function first writes out all the template parameters, if any, using writetemplateattributes. Then, if the function is declared inline, inline is written to the output stream. Then, if there is a return type specified, first its declarator is written followed by its declarator specifier (omitting the name of the variable). Next, we write the name of the function. The function that supplies the name is virtual because constructors and destructors do not have separate names. After writing the name, we move on to the parameter list and write it using writeparameters. This function calls writeasparameters for each parameter in the VariableDeclaration collection. Now, it is time to write the exception specifications, if any, using writeexceptionspecs. This function first checks whether the target compiler supports exception specifications; if it does not, nothing is written, otherwise every type name in the given list is written. Next, we write try for the function level try block after checking the result of FunctionTryBlock. This function returns true only if the target compiler supports function level try blocks and the function has associated catch clauses. With everything done, it is time to write the body of the function using writebody. writebody iterates through all the statements and writes them one at a time. Finally, if FunctionTryBlock returned true earlier, we write the associated catch clauses.
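The order of those steps can be sketched in a few lines. FunctionSketch and this free-standing writeText are hypothetical stand-ins, not the real Function class; parameters and statements are assumed to be already rendered as text, and the exception specification step is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the pieces a Function holds.
struct FunctionSketch {
    std::string templateHeader;             // e.g. "template <typename T>", may be empty
    bool isInline;
    std::string returnType;                 // empty for constructors and destructors
    std::string name;
    std::vector<std::string> parameters;    // each already rendered as "type name"
    std::vector<std::string> statements;
    std::vector<std::string> catchClauses;  // non-empty means a function level try block
    FunctionSketch() : isInline(false) {}
};

// Writes the pieces in the order described above.
std::string writeText(const FunctionSketch& f) {
    std::ostringstream out;
    if (!f.templateHeader.empty()) out << f.templateHeader << '\n';
    if (f.isInline) out << "inline ";
    if (!f.returnType.empty()) out << f.returnType << ' ';
    out << f.name << '(';
    for (std::size_t i = 0; i < f.parameters.size(); ++i) {
        if (i != 0) out << ", ";
        out << f.parameters[i];
    }
    out << ')';
    const bool tryBlock = !f.catchClauses.empty();
    if (tryBlock) out << " try";
    out << " {\n";
    for (std::size_t i = 0; i < f.statements.size(); ++i)
        out << "    " << f.statements[i] << '\n';
    out << '}';
    for (std::size_t i = 0; i < f.catchClauses.size(); ++i)
        out << ' ' << f.catchClauses[i];
    out << '\n';
    return out.str();
}

// Small usage check: an inline function with one parameter and one statement.
inline void demoWriteText() {
    FunctionSketch f;
    f.isInline = true;
    f.returnType = "int";
    f.name = "twice";
    f.parameters.push_back("int x");
    f.statements.push_back("return 2 * x;");
    assert(writeText(f) == "inline int twice(int x) {\n    return 2 * x;\n}\n");
}
```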

This discussion left out one important piece of functionality: the need to write a function declaration in header files and in class definitions. We do this with the help of writedeclaration, which calls the virtual function declaration. This function writes all the template parameters, the return type if any, the declarations of all parameters, and the exception specifications if any.

The next important class is MemberFunction. A MemberFunction represents a class member function. A separate class is necessary because it has a set of attributes like static, virtual, const, volatile, pure and accessibility. Some of these are mutually exclusive or imply others: a static function cannot be virtual, const or volatile; a pure function implies virtual; virtual excludes any template parameters. All the dirty work is handled in the get/set functions for its attributes, and writetext works almost like that of Function, with the additional work of writing the template parameters of its parent class and writing attributes like static, virtual, const, volatile and = 0 where applicable. declaration also works similarly and inserts additional information where necessary.
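Here is a rough sketch of how setters can enforce those rules. MemberFunctionFlags is a hypothetical class, and the policy that the most recent setter wins a conflict is my assumption for the example; the real get/set functions may resolve conflicts differently.

```cpp
#include <cassert>

// Hypothetical attribute holder; only the rules named above are modelled.
class MemberFunctionFlags {
public:
    MemberFunctionFlags()
        : static_(false), virtual_(false), const_(false),
          volatile_(false), pure_(false), hasTemplateParams_(false) {}

    // static excludes virtual, const, volatile (and therefore pure).
    void setStatic(bool on) {
        static_ = on;
        if (on) { virtual_ = const_ = volatile_ = pure_ = false; }
    }
    // virtual excludes static and template parameters; dropping it drops pure.
    void setVirtual(bool on) {
        virtual_ = on;
        if (on) { static_ = false; hasTemplateParams_ = false; }
        else    { pure_ = false; }
    }
    // pure implies virtual.
    void setPure(bool on) {
        pure_ = on;
        if (on) setVirtual(true);
    }
    void setConst(bool on)    { const_ = on;    if (on) static_ = false; }
    void setVolatile(bool on) { volatile_ = on; if (on) static_ = false; }
    void setHasTemplateParams(bool on) {
        hasTemplateParams_ = on;
        if (on) { virtual_ = false; pure_ = false; }
    }

    bool isStatic() const   { return static_; }
    bool isVirtual() const  { return virtual_; }
    bool isConst() const    { return const_; }
    bool isVolatile() const { return volatile_; }
    bool isPure() const     { return pure_; }
    bool hasTemplateParams() const { return hasTemplateParams_; }

private:
    bool static_, virtual_, const_, volatile_, pure_, hasTemplateParams_;
};

// Small usage check: marking a function pure makes it virtual as well.
inline void demoFlags() {
    MemberFunctionFlags flags;
    flags.setPure(true);
    assert(flags.isVirtual() && flags.isPure());
}
```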

Constructor represents a class’s constructor. It has two additional attributes: get/set functions for explicit and a list of constructor initializers. writetext does the extra work of writing explicit before inline if necessary and writing the constructor initializers right before writing the body of the constructor. declaration changes to accommodate the facts that a constructor has no return type and has the same name as its parent class.

A Destructor has two additional attributes apart from those of Function, representing its virtual and pure properties. Its writetext is very simple and only writes the template parameters of the parent class, the name, exception specifications, function level try block and body of the destructor. declaration is even simpler: it writes the template parameters of the parent class, the name and the exception specifications.

Finally, we consider the syntactic sugar, Operator and MemberOperator. An Operator is just a function with two changes: it does not have a name of its own, so Name returns the textual representation of the operator it overloads, and it has one additional attribute that returns the operator type. It does not override writetext or declaration because overriding Name does the job. A MemberOperator derives from Operator and adds additional attributes just like MemberFunction does to Function, with the noted exception that a member operator cannot be declared static. writetext and declaration for MemberOperator work just like those for Operator, except that they add the template parameters of the parent class before anything else.
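The reason overriding Name is enough can be shown with a tiny sketch. NamedFunction, OperatorFunction and OperatorType are hypothetical stand-ins for Function and Operator, and the fixed signature inside declaration is only there to keep the example short.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Function: declaration() asks the virtual Name().
class NamedFunction {
public:
    explicit NamedFunction(const std::string& name) : name_(name) {}
    virtual ~NamedFunction() {}
    virtual std::string Name() const { return name_; }
    std::string declaration() const {
        return "bool " + Name() + "(const T& lhs, const T& rhs);";
    }
private:
    std::string name_;
};

enum OperatorType { EqualTo, LessThan };

// Hypothetical stand-in for Operator: no stored name, Name() is derived from
// the operator type, and declaration() is inherited unchanged.
class OperatorFunction : public NamedFunction {
public:
    explicit OperatorFunction(OperatorType type) : NamedFunction(""), type_(type) {}
    OperatorType Type() const { return type_; }
    virtual std::string Name() const {
        switch (type_) {
            case EqualTo:  return "operator==";
            case LessThan: return "operator<";
        }
        return "operator?";   // unreachable for the two values above
    }
private:
    OperatorType type_;
};

// Small usage check: the inherited declaration() picks up the overridden name.
inline void demoOperatorName() {
    OperatorFunction eq(EqualTo);
    assert(eq.declaration() == "bool operator==(const T& lhs, const T& rhs);");
}
```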

Type’s family is already squeaking in my ears to go out to public.

23 April 2007

Finally some XPS in native C++

Filed under: C++ — Tanveer Badar @ 2:16 PM

I just read this and haven’t checked it myself. Use with care and beware of security vulnerabilities in zlib.

18 April 2007

Start getting dirty with Visual Studio Orcas

Filed under: Bugz, C++, Orcas — Tanveer Badar @ 11:42 AM

I got my hands on the VS Orcas March CTP yesterday and tried compiling CPlusPlusCodeProvider, on which I have been working for five years. The code has compiled and worked flawlessly on all previous versions of VC++ [VC++ .Net to VC++ 2005], but now the new compiler reports an error on assignment lines involving auto_ptr objects.

Wherever I do <object> = new <type>( *other.<member> );, this new compiler whines. It gives an error about ambiguous assignment operators and says it cannot choose between std::auto_ptr< Inner >::operator = ( const std::auto_ptr< Inner >& ) and std::auto_ptr_ref< Inner >::operator = ( const std::auto_ptr_ref< Inner >& ).

When I looked at the source for auto_ptr none of the classes had assignment operators defined, so the compiler is generating the default implementations. However, they both have a public constructor declared like this auto_ptr( T* ptr ) and auto_ptr_ref( T* ptr ).

I don’t remember if these converting constructors should cause any ambiguity. Converting constructor for auto_ptr is more qualified than auto_ptr_ref. It should be the one that handles this conversion, instead of creating any ambiguity.

As I remember the standard, you cannot create an instance of auto_ptr_ref directly, therefore, this constructor should not be declared public but protected. If declared public, it should have been declared explicit in the first place. Moreover, in the library shipping with VC++ 2005, auto_ptr does not derive from auto_ptr_ref. It has just a converting constructor.

I know the VC++ team does not write the libraries themselves; it is Dinkumware that should be blamed, if this really is incorrect code.
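For the assignment pattern described above, one way to sidestep the converting constructors entirely is to call reset instead of assigning a raw pointer. The sketch below uses a hypothetical OwningPtr stand-in rather than std::auto_ptr itself, because auto_ptr was removed in C++17; I have not tried this against the Orcas CTP, so treat it as an idea, not a verified fix.

```cpp
#include <cassert>

// Hypothetical stand-in for std::auto_ptr, reduced to what the example needs.
template <typename T>
class OwningPtr {
public:
    explicit OwningPtr(T* p = 0) : ptr_(p) {}
    ~OwningPtr() { delete ptr_; }
    // Takes ownership of p and destroys the previously owned object.
    void reset(T* p = 0) {
        if (p != ptr_) { delete ptr_; ptr_ = p; }
    }
    T& operator*() const { return *ptr_; }
    T* get() const { return ptr_; }
private:
    OwningPtr(const OwningPtr&);             // copying is left out of this sketch
    OwningPtr& operator=(const OwningPtr&);
    T* ptr_;
};

struct Inner {
    explicit Inner(int v) : value(v) {}
    int value;
};

class Outer {
public:
    explicit Outer(int v) : member_(new Inner(v)) {}
    Outer(const Outer& other) : member_(new Inner(*other.member_)) {}
    Outer& operator=(const Outer& other) {
        if (this != &other) {
            // Instead of: member_ = new Inner(*other.member_);
            // no implicit pointer-to-smart-pointer conversion is involved here.
            member_.reset(new Inner(*other.member_));
        }
        return *this;
    }
    int value() const { return (*member_).value; }
private:
    OwningPtr<Inner> member_;
};

// Small usage check: assignment makes a deep copy of the owned object.
inline void demoReset() {
    Outer a(1), b(2);
    a = b;
    assert(a.value() == 2 && b.value() == 2);
}
```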
