Saturday, March 29, 2008

What's In A Name?

This question was once asked by an obscure poet. While I'll not attempt to answer that exact question here, a related issue does arise in the context of JavaScript namespaces and the things that are placed in them.

You see, normal functions have names (sort of):
function foo () { }

In Firefox, one can even read foo.name to get the value "foo". Why would one do this, you ask? There are several useful applications for such information. For example, we can walk the callstack:
function stackdump ()
{
    var a = [];
    for (var f = arguments.callee.caller; f; f = f.caller)
        a.push(f.name);
    return a;
}

But this breaks down with $namespace as we currently have it. The problem is that $namespace copies functions from an object literal to the target namespace. Functions declared in this manner have no name:
$namespace("foo.bar", {

func : function () { }

});

So, while user code knows this function as foo.bar.func, the JS function object does not know its own name. My first idea was to set the name property in the copy loop. The name is known at the time the function object is copied after all.

There are really only two problems with my first idea: it didn't work and it really wouldn't have helped anyway. Either of these alone would probably be enough to put an end to the pursuit, but it is worth a bit of explanation. For the first problem, Firefox won't let you set the name property. Second problem: the name property is only present in Firefox (like I said above), so providing it only in this case does not give a universal answer for the name of a function.
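The first problem is easy to reproduce. What follows is a minimal sketch, assuming an engine where a function's name property is read-only (the Firefox behavior described above, and to my knowledge the behavior of current engines as well). The function foo and the string assigned to it are purely illustrative.

```javascript
function foo () { }

// Try to rename the function. In sloppy mode the assignment is silently
// ignored; in strict mode it throws a TypeError, hence the try/catch.
var assignmentThrew = false;
try {
    foo.name = "foo.bar.func";
} catch (e) {
    assignmentThrew = true;
}

// Either way, the function still reports its declared name.
var nameAfter = foo.name;
```

Whether the assignment throws or is quietly dropped depends on strictness, but in both cases the name does not change.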

What is needed is a way to solve the problem of name for every function object. So how can we make something available to all function objects, even ones we've never encountered before? The answer lies in the magic powers of JavaScript prototypes.

The prototypal inheritance nature of JavaScript makes it one of the very few truly Object Oriented Programming Languages. Most "OO" languages like C++ and Java use the concept of a class as distinct from an object. The class serves as a kind of static, built-in indirection mechanism. All objects of a given class indirectly reference their class definition. In Java, the class has a run-time presence in the form of an object of type Class. These class objects provide limited, read-only access to the class declaration at run-time. In C++, there is no real run-time representation of a class at all (even given the so-called RTTI feature of the language).

Background: In JavaScript, every object implicitly points to another object called its prototype object. Whenever an object property is requested via the "." operator, that object is first examined to see if the property exists (as one would expect). Then things get interesting. If the property does not exist, the prototype object is also searched for the property. This repeats until the prototype chain is exhausted or a value is found, whichever comes first.

The prototype for an object is set when the object is created using the new operator. Since there are no classes in JavaScript, functions serve as constructors. When an object is created using a particular function as a constructor, that function object's prototype property becomes the newly created object's prototype object. Note, the prototype object is not stored as the prototype property of the new object. The prototype for an object is stored internally and is not (portably) accessible.

There are several pre-defined constructors: Object, Array and Function being the most useful. All objects terminate their prototype chain with Object.prototype. All array objects pass through Array.prototype and then on to Object.prototype. Likewise, all function objects have Function.prototype followed by Object.prototype in their prototype chain. In other words, we can add properties to all objects by adding to Object.prototype.
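The lookup rules above can be seen in a few lines. This is only an illustrative sketch: the Point constructor and its describe method are hypothetical names invented for the example, not part of the $namespace code.

```javascript
// Hypothetical constructor used only for illustration.
function Point (x, y)
{
    this.x = x;
    this.y = y;
}

// Properties on Point.prototype are visible through every Point object.
Point.prototype.describe = function ()
{
    return "(" + this.x + ", " + this.y + ")";
};

var p = new Point(1, 2);

var ownX = p.hasOwnProperty("x");               // true: stored on p itself
var ownDescribe = p.hasOwnProperty("describe"); // false: found on the prototype
var text = p.describe();                        // "(1, 2)"

// The constructor's prototype property became p's prototype object...
var linked = Point.prototype.isPrototypeOf(p);  // true
// ...but it is not stored as a property named "prototype" on p.
var noProtoProp = (p.prototype === undefined);  // true

// Function objects pass through Function.prototype, then Object.prototype.
var fnChain = Function.prototype.isPrototypeOf(Point) &&
              Object.prototype.isPrototypeOf(Function.prototype); // true
```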

Before you get tempted to run out and add stuff to either Object.prototype or Array.prototype let me warn you of the EVIL that will ensue if you do so. Consider the following for loop:
for (var x in [ a, b, c ])
...

The author's intent is fairly clear. The definition of for (x in a), however, is such that it will iterate over all enumerable properties, including those added to Array.prototype and/or Object.prototype! Shudder. As you might imagine, others have written about this issue. The best one can say is that modifying these prototype objects produces code that "doesn't play well with others". My advice is that for your own sanity, but more for those who use your code, stay away from this technique.
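To make the hazard concrete, here is a small sketch. The method name first is a hypothetical extension added only for the demonstration (and removed again at the end), and the hasOwnProperty filter is a common defensive idiom rather than anything from the $namespace code.

```javascript
// Hypothetical extension, added only to demonstrate the problem.
Array.prototype.first = function () { return this[0]; };

// for-in yields property names (the indices as strings), plus anything
// enumerable that was added to the prototype chain.
var seen = [];
for (var x in [ "a", "b", "c" ])
    seen.push(x);
// seen now contains "0", "1", "2" and also "first"

// Filtering with hasOwnProperty skips inherited properties.
var list = [ "a", "b", "c" ], own = [];
for (var k in list)
    if (list.hasOwnProperty(k))
        own.push(k);

// Clean up so the rest of the program is unaffected.
delete Array.prototype.first;
```

The filter protects your own loops, but it does nothing for loops in other people's code, which is why the extension itself is the thing to avoid.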

With that rant out of the way, I feel like I can move on. Fortunately for us, we can modify Function.prototype and have none of these problems. We need two methods:
// called as function objects are added to a namespace via $namespace:
Function.prototype.$namespace = function (ns, name)
{
    var fn = ns.$namespace.fullname ? (ns.$namespace.fullname + ".") : "";
    this.$meta = { fullname : fn + name, name : name, namespace : ns };
};

Function.prototype.getName = function ()
{
    if (!this.$meta)
    {
        // Fall back to parsing the source text where the name property is
        // missing; guard against a failed match (e.g. "function(" with no space).
        var m = this.name ? null : this.toString().match(/function\s*([\w$]*)/);
        var name = this.name || (m && m[1]) || "~anonymous~";
        this.$meta = { fullname : name, name : name, namespace : $namespace.global };
    }

    return this.$meta.fullname;
};

If you recall, the $namespace function checks each value it copies for a $namespace property and, if one is present, calls it as a method. That is to say, if we provide a $namespace method on Function.prototype, it will be called whenever a function is added to a namespace.
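Here is a rough, self-contained sketch of that hand-off. The hook is repeated from above so the example stands alone; the bar object is a hand-built stand-in for a namespace that $namespace("foo.bar") would create, and copyMembers is a hypothetical miniature of the copy loop, not the real function.

```javascript
// Hook from the post, repeated so this sketch is self-contained.
Function.prototype.$namespace = function (ns, name)
{
    var fn = ns.$namespace.fullname ? (ns.$namespace.fullname + ".") : "";
    this.$meta = { fullname : fn + name, name : name, namespace : ns };
};

// Hand-built stand-in for the namespace object foo.bar (assumption).
var bar = { $namespace : { children : [], fullname : "foo.bar",
                           name : "bar", parent : null } };

// Hypothetical miniature of the copy loop inside $namespace.
function copyMembers (ns, members)
{
    for (var name in members)
    {
        var v = members[name];
        ns[name] = v;
        if (v && v.$namespace)      // detection...
            v.$namespace(ns, name); // ...and delegation
    }
}

copyMembers(bar, { func : function () { }, answer : 42 });

var full = bar.func.$meta.fullname; // "foo.bar.func"
var plain = bar.answer;             // 42: numbers have no hook, so just copied
```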

After all that then, the method getName is now defined for all function objects. Here now is the portable way to get the name of a function object:
function foo () { }
var s = foo.getName();

And the modified method to walk the callstack:
function stackdump ()
{
    var a = [];
    for (var f = arguments.callee.caller; f; f = f.caller)
        a.push(f.getName());
    return a;
}

Sunday, March 16, 2008

More JavaScript Joy

Before I jump in to the promised $namespace function, let me introduce you to a close friend of mine:
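function $panic (msg)
{
    // typeof avoids a ReferenceError where console is not defined at all
    if (typeof console == "undefined" || !console)
        alert("PANIC: " + msg);
    else
    {
        console.error("PANIC: " + msg);
        if (console.open)
            console.open();
    }
}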

This little function and I have been friends for some time. The Firebug call for logging errors was added alongside the basic alert statement many moons ago, and I try to leave a Firebug breakpoint set right inside. Very handy. If you think this only applies to Firefox, you should look at Firebug Lite.

Anyway, here's the $namespace function:
// Forms:
// #1 (ns, {});
// #2 ("...");
// #3 ("...", {});
// #4 (ns, "...");
// #5 (ns, "...", {});
function $namespace ()
{
    var sub, ns = arguments[0], pos = 1; // form #1
    if (typeof(ns) == "string") // forms #2 & #3
    {
        sub = ns;
        ns = $namespace.global;
    }
    else if (typeof(arguments[1]) == "string") // forms #4 & #5
    {
        sub = arguments[1];
        pos = 2;
    }

    if (sub)
    {
        var parts = sub.split(".");
        for (var i = 0; i < parts.length; ++i)
        {
            var s = parts[i], fn = ns.$namespace.fullname;
            if (!ns[s])
                ns.$namespace.children.push(ns[s] =
                    { $namespace : { children : [],
                                     fullname : (fn ? (fn+".") : "") + s,
                                     name : s, parent : ns } });

            ns = ns[s];
        }
    }

    var add = arguments[pos];
    if (add)
        for (var name in add)
        {
            var v = add[name];
            if (ns.hasOwnProperty(name))
                $panic("Namespace "+ns.$namespace.fullname+
                       " conflict with '"+name+"'");
            ns[name] = v;
            if (v && v.$namespace)
                v.$namespace(ns, name);
        }
}

$namespace.children = [];
$namespace.fullname = "";
$namespace.global = function () { return this; }();
$namespace.name = ""; // has no effect where a function's name property is read-only
$namespace.parent = null;

There are several points worthy of some discussion in the above code. The first thing beyond the mundane usage forms is that all namespaces contain a $namespace property. This property holds a few key pieces of meta-data for the curious. The name and fullname properties are pretty self-explanatory. The parent property stores a link up the hierarchy. The children array contains the list of child namespaces. This can be convenient since child namespaces are stored in their parent namespace alongside all the other stuff (like functions).
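To show what that meta-data looks like once a couple of namespaces exist, here is a condensed sketch. It is not the $namespace function itself: makeRoot and ensure are hypothetical helpers that mirror only the path-walking loop, and they build under an explicit root object instead of the global scope so the example can run anywhere.

```javascript
// Hypothetical root standing in for the global namespace.
function makeRoot ()
{
    return { $namespace : { children : [], fullname : "", name : "", parent : null } };
}

// Hypothetical helper mirroring the path-walking loop of $namespace.
function ensure (root, path)
{
    var ns = root, parts = path.split(".");
    for (var i = 0; i < parts.length; ++i)
    {
        var s = parts[i], fn = ns.$namespace.fullname;
        if (!ns[s])
        {
            ns[s] = { $namespace : { children : [],
                                     fullname : (fn ? (fn + ".") : "") + s,
                                     name : s, parent : ns } };
            ns.$namespace.children.push(ns[s]);
        }
        ns = ns[s];
    }
    return ns;
}

var root = makeRoot();
var sub = ensure(root, "mypkg.subpkg");
ensure(root, "mypkg.other");

var fullname = sub.$namespace.fullname;                  // "mypkg.subpkg"
var parentOk = (sub.$namespace.parent === root.mypkg);   // true
var childCount = root.mypkg.$namespace.children.length;  // 2
var sameObject = (ensure(root, "mypkg.subpkg") === sub); // true: not recreated
```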

I decided to store all this meta-data in a $namespace property for a couple of reasons. First, I wanted to reduce the probability of name collisions as much as possible. I mean, it is the user's namespace after all. Second, this was symmetric with the presence of $namespace itself in the global namespace. In fact, you can see several properties added to the $namespace function for this purpose.

Something rather cute is the $namespace.global property. That odd little function is one way to reference the global scope. You could ask "why not just use 'window' and be done with it?" The answer is that by doing it this way, this code will work in non-browser contexts (such as the Rhino JavaScript engine). My $panic function is probably browser-specific. That is unless the Java-side were to publish console and/or alert methods, which is a nice way to go if you're using Rhino.
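A short sketch of the idiom, with one caveat worth labeling: the form used in the post depends on non-strict ("sloppy") handling of this. The second form below, which goes through the Function constructor, is a variant I am adding for comparison and is not what the post's code uses.

```javascript
// The post's idiom: a plain call leaves "this" bound to the global object.
// In strict mode (or inside an ES module) the same call yields undefined.
var viaCall = (function () { return this; })();

// Variant (assumption, not from the post): code created by the Function
// constructor does not inherit the caller's strictness, so this form
// returns the global object either way.
var viaConstructor = Function("return this")();

// Built-ins such as Math are properties of the global object.
var hasMath = (viaConstructor.Math === Math);
```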

The last item worth talking about is in the loop where members are copied to the new namespace. Other than checking for name collision (a good idea to be sure), there is a detection and delegation step: the loop inspects each object for a $namespace property and, if found, invokes it as a function passing the namespace to which the object is being added as well as the object's name. The reason for this will become clear in a future post.

Well, that's it for now. Enjoy!

Monday, March 10, 2008

The Joy of JavaScript

I really like JavaScript (JS). The language strikes a great balance between simplicity and expressiveness. Basically, JS is an awesome little language, especially since the addition of object and array literals (JSON)! I have very mixed feelings about the little I've seen regarding the next major version of JS/ECMAScript. If ever it is available. And widely supported enough to be useful. Let's just say that I'm not holding my breath.

Maybe the new JS will solve a lot of problems (I fear it will be mostly noise added to an otherwise simple language). The problem many of us suffered with for far too long was not the JS language, but the environment: the browser! To some extent that remains true, but with the introduction of Firebug and the Firefox Error Console, JS has come into its own. Now that there are real tools for developers, JS is much more useful. Even the IDEs are recognizing .js files and not just syntax highlighting them, but also finding syntax errors, the bane of JS in the old days.

That said, there are a few things that other languages bring to the table that would be very helpful for JS development in the large. As I see it, the most important things missing from JS are:
  • Namespaces
  • Modules (JS loading JS dynamically)

Others may see things differently, but these are both essentials in my book. I see class-based inheritance as “nice to have”, but I'll try not to digress into that just yet.

The problem with the above list is that only one of these is actually implementable in the language, so I'll focus on that one in a moment. The other one really needs language support to work properly.

In case you haven't guessed it yet, namespaces can be simulated using existing language features. I stress “simulated” because namespaces in C++ and packages (their analogue in Java) are compile-time scope resolution mechanisms. The goal of a namespace is to reduce name collisions, primarily in the global scope. Since JS has no compile-time per se, we'll have to live with what JS can provide: name lookup.

In Java, a fully specified name looks like:

mypkg.subpkg.name

We can achieve this same syntax if “mypkg” were an object containing another object named “subpkg”, etc. You get the idea. The issue is that namespaces need to be open to extension. This means that the following JS is not quite there:
var mypkg = { subpkg : { name : value } };

It may create the structure above, but what about adding another symbol? If the above approach were used:
var mypkg = { foo : 42 };

We overwrite the previous content of mypkg – not our goal! What is needed is a helper function that can be used the same in both places, that will create the namespace if it does not exist and will not overwrite it if it does. Something like this:
$namespace("mypkg"); // create mypkg if not already defined
$namespace("mypkg.subpkg"); // add "subpkg" to mypkg if not already defined
$namespace(mypkg, "subpkg"); // add "subpkg" to mypkg if not already defined

$namespace(mypkg.subpkg, // add items to mypkg.subpkg
{
    name : value
});

$namespace(mypkg, "subpkg", // if mypkg is known to exist
{
    name : value
});

$namespace("mypkg.subpkg",
{
    name : value
});
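For comparison, the overwrite problem that motivates the helper can be reproduced in a few lines. This is only a sketch: the values are placeholders, and the guarded x = x || { } form is a common hand-written idiom shown for contrast, not part of $namespace.

```javascript
// First file: build the structure with a literal.
var mypkg = { subpkg : { name : "value" } };

// Second file: the same approach replaces the whole object.
var mypkg = { foo : 42 };
var lostSubpkg = (mypkg.subpkg === undefined);     // true: subpkg is gone

// A guarded form extends the existing object instead of replacing it.
var mypkg2 = { subpkg : { name : "value" } };
var mypkg2 = mypkg2 || { };
mypkg2.foo = 42;
var keptSubpkg = (mypkg2.subpkg.name === "value"); // true
```

The guard works, but it has to be repeated by hand at every level of the hierarchy, which is the repetition a single helper function is meant to remove.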

I've chosen a style that I call a “pseudo-keyword”. The “$” character is not widely used in code, is not used in the standard JS objects or methods and gets your attention like a keyword should. I'll post an implementation of this somewhat flexible method next time.