Any love for early binding?

I'm designing a statically-typed compile-to-JavaScript language, like a better TypeScript, based on looking at the patterns in my code that are a pain to express in TypeScript or that require sacrificing performance for flexibility or readability. The type system will be simple enough to have a spec, and nominal (like Java), but "modern." Static type information should allow more compact runtime representations (e.g. "unboxed" values, typed arrays) and other performance features that avoid allocations (like reusing a cursor object in a loop without it being aliased).

I'm realizing that I really like the concept of early binding. I want my language to support a mix of early and late binding, but I want guaranteed-early binding to be a thing. Does anyone here feel similarly drawn to early binding, or have resources in praise of early binding? I love Alan Kay, but the Internet seems plastered with praise for late binding, with many commenters saying the only benefit of early binding is performance, which is increasingly a non-issue because computers are fast.

It's true that I've written a lot of framework/engine code in high-level languages like Java and JavaScript, and I care more about performance across the board than your average programmer, but I really think that people would not be needing to rewrite so much code in languages like C and Rust (e.g. compiled to WebAssembly) if they were able to write code in such a way that heap allocations were fewer and smaller, and function calls tended to be monomorphic/statically known.

At a conceptual level, in thinking about what it would take to make call sites generally monomorphic instead of polymorphic, I am realizing how much superfluous polymorphism/dynamism is assumed in a language like JavaScript—and, more generally, assumed by the late-bind-all-the-things OOP philosophy—and it's also pointing me towards interesting insights.

For example, suppose a module Foo has a class Foo that needs to create a set of integers as part of the workings of its implementation. There are different ways to represent a set of integers in JavaScript, for example a built-in Set, or you could use a sorted array of integers, or a sorted typed array (e.g. if the integers are between 0 and 255 you could use a Uint8Array). Suppose the integers are small and memory is at a premium (we'll have a whole lot of Foo instances), and Foo is a single-purpose-enough module that we can just pick one way of representing the set of integers, like that Uint8Array strategy. We'll still want to put that integer set implementation in its own module, and maybe have it implement some kind of interface or protocol like `Set<Integer>`. This makes the code of Foo readable, and it encapsulates the implementation of our integer set, and we can have unit tests for it. Since we are going to all this trouble to implement a set in terms of an array (even if it just takes an hour or something to write the code and tests), we might as well make it reusable for the next time we are in this situation. In addition, if we change our mind about how Foo should store its integers, or the requirements change, we can implement a different kind of integer set and easily switch back and forth between two implementations with the same interface.
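A minimal sketch of that Uint8Array strategy, assuming the integers are in 0..255. The names `IntSet` and `SmallIntSet` are made up for illustration, and the interface is only a guess at what a `Set<Integer>` protocol might contain:

```typescript
// Hypothetical interface; the post only names it loosely as `Set<Integer>`.
interface IntSet {
  has(n: number): boolean;
  add(n: number): void;
  readonly size: number;
}

// Sorted, duplicate-free bytes stored in a Uint8Array that doubles when full.
class SmallIntSet implements IntSet {
  private data = new Uint8Array(4);
  private count = 0;

  get size(): number {
    return this.count;
  }

  // Binary search: index of n if present, otherwise the index where n belongs.
  private indexOf(n: number): number {
    let lo = 0;
    let hi = this.count;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.data[mid] < n) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  has(n: number): boolean {
    const i = this.indexOf(n);
    return i < this.count && this.data[i] === n;
  }

  add(n: number): void {
    if (!Number.isInteger(n) || n < 0 || n > 255) {
      throw new RangeError("expected an integer in 0..255");
    }
    const i = this.indexOf(n);
    if (i < this.count && this.data[i] === n) return; // already present
    if (this.count === this.data.length) {
      const bigger = new Uint8Array(this.data.length * 2);
      bigger.set(this.data);
      this.data = bigger;
    }
    // Shift the tail one slot to the right, then drop n into the gap.
    this.data.copyWithin(i + 1, i, this.count);
    this.data[i] = n;
    this.count++;
  }
}
```

Each element costs one byte plus the unused slack in the buffer, which is the memory argument for picking this representation when there are many Foo instances.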

However, and this is the point, the purpose of having different implementations of an interface here is not so we can do some kind of late-bound dynamic dispatch. The goal is not that every time a Foo adds an integer to a set, the runtime goes, "Hmm, how interesting. Well, there are different kinds of integer sets, and different Foos can have different kinds, so I'll have to see how to do that in this particular instance." Runtime polymorphism is not needed to motivate the abstraction and encapsulation here. If our type system doesn't even let us have a heterogeneous collection of different kinds of integer sets, or different kinds of Foos that use different kinds of integer sets, that's actually fine. We are just looking for design-time (compile-time) configurability and interchangeability.
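One way to approximate design-time interchangeability in today's TypeScript is to pick the implementation with a single alias. This is only a sketch: `SortedArraySet`, `NativeSet`, `IntSetImpl`, and the methods on `Foo` are hypothetical names, and TypeScript itself does not promise how the engine will bind the calls.

```typescript
// Two hypothetical implementations exposing the same set of functions.
const SortedArraySet = {
  create: (): number[] => [],
  add(s: number[], n: number): void {
    let i = 0;
    while (i < s.length && s[i] < n) i++;
    if (s[i] !== n) s.splice(i, 0, n); // keep sorted, skip duplicates
  },
  has(s: number[], n: number): boolean {
    return s.indexOf(n) !== -1;
  },
};

const NativeSet = {
  create: (): Set<number> => new Set<number>(),
  add(s: Set<number>, n: number): void {
    s.add(n);
  },
  has(s: Set<number>, n: number): boolean {
    return s.has(n);
  },
};

// The one line to edit when requirements change. Every Foo uses the same
// implementation; there is no per-instance choice to dispatch on.
const IntSetImpl = NativeSet;
type IntSetRep = ReturnType<typeof IntSetImpl.create>;

class Foo {
  private ids: IntSetRep = IntSetImpl.create();

  remember(n: number): void {
    IntSetImpl.add(this.ids, n);
  }

  knows(n: number): boolean {
    return IntSetImpl.has(this.ids, n);
  }
}
```

Swapping `NativeSet` for `SortedArraySet` changes `IntSetRep` along with it, so the type checker confirms that Foo never depended on one particular representation.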

I've done some benchmarking of V8 in monomorphic and polymorphic cases, and monomorphism does make a big difference. For example, a function `foo(x)` that calls `x.bar()` where `bar` resolves to different function instances for different `x` will be much slower than if `bar` is always the same (or just calling `bar(x)`). Runtime monomorphism facilitates optimizations like inlining. In fact, in some cases it is worth writing the source code of the function `foo` twice if it means that you have two monomorphic functions instead of one polymorphic function!
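To make the shape of that trade-off concrete, here is a sketch with hypothetical `Circle` and `Square` classes. It shows only the structure of the code, not a benchmark; whether an engine actually treats the two versions differently depends on the engine version and the workload, so it needs measuring.

```typescript
class Circle {
  r: number;
  constructor(r: number) {
    this.r = r;
  }
  area(): number {
    return Math.PI * this.r * this.r;
  }
}

class Square {
  side: number;
  constructor(side: number) {
    this.side = side;
  }
  area(): number {
    return this.side * this.side;
  }
}

// Polymorphic call site: `s.area()` may resolve to either method at runtime.
function totalArea(shapes: ReadonlyArray<Circle | Square>): number {
  let sum = 0;
  for (const s of shapes) sum += s.area();
  return sum;
}

// The same loop written twice, so each call site only ever sees one receiver type.
function totalCircleArea(circles: ReadonlyArray<Circle>): number {
  let sum = 0;
  for (const c of circles) sum += c.area();
  return sum;
}

function totalSquareArea(squares: ReadonlyArray<Square>): number {
  let sum = 0;
  for (const s of squares) sum += s.area();
  return sum;
}
```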

So if you have some noun and want to perform a verb on it—suppose it's some custom data structure called "List" and you want to get the "length"—the best thing for the JavaScript engine is if you just call a function like "listLength". Through some combination of namespacing and static types, we could let you write that as `List.length(L)` or `length(L)` or `L.length`, but with early binding, it would compile down to a simple function call, not some kind of property access or dispatch.
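As an illustration of what "compile down to a simple function call" could look like, here is a hand-written guess at the output. The surface syntax in the comments belongs to the proposed language, not TypeScript, and the `List` representation and the `listCreate`/`listPush`/`listLength` names are assumptions made for the example.

```typescript
// Hypothetical source in the proposed language:
//   let n = L.length        // L is statically known to be a List
//
// Possible compiled form: the operation is chosen at compile time,
// so the output is a direct call to one known function.

// Assumed runtime representation of a List.
type List = { items: number[]; used: number };

function listCreate(): List {
  return { items: [], used: 0 };
}

function listPush(l: List, x: number): void {
  l.items[l.used++] = x;
}

function listLength(l: List): number {
  return l.used;
}

// What `L.length` would turn into:
const L = listCreate();
listPush(L, 10);
listPush(L, 20);
const n = listLength(L);
```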

Having static types affect behavior (function/method binding) is a rare thing to see these days. It doesn't happen in TypeScript, for obvious reasons. In Haskell or ML, it is also verboten for types to affect behavior, because you are not supposed to have to specify any type annotations; the type system is designed around the goal of whole-program type inference (which is a non-goal for me).

I think statically binding function calls to code makes a lot of sense, though. It enables language features I've been wanting to see for a long time, like being able to have an "object" whose representation at runtime is a primitive. (I think Scala can do something like this.) If the type of a value can't always be distinguished at runtime, that affects the ability to have things like untagged unions. But it seems worth it. I also like the idea of the compiler sometimes passing around the "class" of a value as a hidden argument at runtime, for when polymorphism is necessary. In other cases, the compiler might actually propagate a type parameter through function calls and duplicate the source code, template-style. The duplicated code might be identical (in which case it could still benefit the JavaScript engine by being monomorphic), or it might have different inlined or macro-expanded code based on the differently instantiated type parameters.
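Both ideas can be imitated by hand in TypeScript, which may help make them concrete. In this sketch, `Celsius`, `Ops`, and `showAll` are hypothetical names. The brand exists only in the type checker, and the `ops` parameter stands in for the "class" argument that the proposed compiler would pass implicitly.

```typescript
// 1. An "object" whose runtime representation is a plain number.
//    The brand is a compile-time fiction; nothing is allocated for it.
type Celsius = number & { readonly __brand: "Celsius" };

function celsius(n: number): Celsius {
  return n as Celsius;
}

// Statically bound "method": chosen by the static type, not looked up on the value.
function celsiusToFahrenheit(c: Celsius): number {
  return (c * 9) / 5 + 32;
}

// 2. Passing the "class" as an explicit argument for the cases that do need
//    polymorphism. Here it is written by hand.
interface Ops<T> {
  show(value: T): string;
}

const celsiusOps: Ops<Celsius> = { show: (c) => c + " C" };
const stringOps: Ops<string> = { show: (s) => JSON.stringify(s) };

function showAll<T>(ops: Ops<T>, values: ReadonlyArray<T>): string[] {
  return values.map((v) => ops.show(v));
}
```

Since a `Celsius` is indistinguishable from any other number at runtime, `typeof` cannot tell them apart, which is the untagged-union limitation mentioned above.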

If you did need differently-configured Foo instances, or Foo instances that access each other's internals, or some FooManager that "owns" a bunch of Foo instances and only operates on Foo instances it owns... I think this could all be tracked by the compiler. (Scala can make sure instances that are associated with different "parent" instances aren't used interchangeably, but as much time as I've spent reading about Scala's type system and calculus, I don't think it's what I need.) Modules, instead of being singletons like in TypeScript, would have parameters and type parameters that form the "configuration" or context of the module. Type parameters would probably be generally at the module level instead of the class/instance level. Hopefully this might even simplify type checking somehow.
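A rough model of a module with parameters is a function that takes the configuration and returns the module's members. This is a sketch under that assumption, with hypothetical names (`IntSetModule`, `makeFooModule`, `nativeInts`, `arrayInts`). In TypeScript the configuration is still passed at runtime through a closure; the proposed compiler would presumably resolve it when the module is compiled.

```typescript
// The "signature" a set-of-integers module must satisfy.
interface IntSetModule<S> {
  create(): S;
  add(s: S, n: number): void;
  has(s: S, n: number): boolean;
}

const nativeInts: IntSetModule<Set<number>> = {
  create: () => new Set<number>(),
  add: (s, n) => {
    s.add(n);
  },
  has: (s, n) => s.has(n),
};

const arrayInts: IntSetModule<number[]> = {
  create: () => [],
  add: (s, n) => {
    if (s.indexOf(n) === -1) s.push(n);
  },
  has: (s, n) => s.indexOf(n) !== -1,
};

// The type parameter S lives at the module level, not on each Foo.
function makeFooModule<S>(Ints: IntSetModule<S>) {
  type Foo = { ids: S };
  return {
    create: (): Foo => ({ ids: Ints.create() }),
    remember: (foo: Foo, n: number): void => Ints.add(foo.ids, n),
    knows: (foo: Foo, n: number): boolean => Ints.has(foo.ids, n),
  };
}

const FooA = makeFooModule(nativeInts);
const FooB = makeFooModule(arrayInts);
// FooA.remember(FooB.create(), 1) is rejected by the type checker here,
// because the two modules were configured with different representations.
```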

The philosophy is, maximize reusability of code, especially core data structures and any sort of complex algorithm or logic, not by using dynamic dispatch, but by using abstraction that is made concrete when the modules are compiled. Maybe this was the idea with C++.

The exploration of this topic has also given me insight into how OOP seems to be misused for situations where you want to define a type of data and a bunch of operations that can be performed on that data. Like many programmers, probably, I first learned about classes (this would have been over 20 years ago, btw!) as bundling together data, in the form of fields, and functions that do something with that data, in the form of methods. Later, I heard that OOP was originally about message-passing, or more fundamentally, late binding. Many of us have probably also experienced taking code that was in an OOP style and rewriting it in a more FP style, with immutable data operated on by functions organized into namespaces, rather than classes, and found the result is much better. There are limits to OOP, whether we're talking original OOP or what is conventionally called OOP now. In Smalltalk, I believe you literally add the numbers 2 and 3 by sending an "add 3" message to 2. Maybe that's just a matter of framing, but to me, it seems to make it harder to write a good compiler or reason about the behavior of code.

In cases where I do want a heterogeneous collection of things with behavior, like a tree of stateful UI widgets, passing messages around (calling methods) that may lead to different behavior by different receivers does make sense. Even then, the ability to subclass and override methods doesn't seem right for most aspects of a widget, even though I keep being drawn back to it to share behavior between different kinds of stateful objects. A widget tree should be designed in a more entity-component-system way, I think. If it's desirable to have an object like `widget` that has identity and a whole grab-bag of properties and methods like `widget.show()` and `widget.name`, and different widgets have different methods, like `penguinWidget.setPenguin(...)`, I'm sure there are ways to do that besides C++/Java-style inheritance. One should be able to import/export behavior into a type (similar to traits/multiple inheritance but from a more modules/FP angle).
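A very small sketch of the entity-component-system direction, assuming widgets are just ids and each aspect lives in its own table. All names here (`WidgetId`, `createWidget`, `setPenguin`, `visibleNames`) are hypothetical, and a real design would need more than this.

```typescript
type WidgetId = number;

// Component stores: each aspect of a widget is a separate table keyed by id.
const names = new Map<WidgetId, string>();
const visible = new Map<WidgetId, boolean>();
const penguins = new Map<WidgetId, string>(); // only penguin widgets get an entry

let nextId = 0;

function createWidget(name: string): WidgetId {
  const id = nextId++;
  names.set(id, name);
  visible.set(id, false);
  return id;
}

// Behavior is a set of ordinary functions, shared without any subclassing.
function show(id: WidgetId): void {
  visible.set(id, true);
}

function setPenguin(id: WidgetId, penguin: string): void {
  penguins.set(id, penguin);
}

// A "system": one statically known function that runs over a component table.
function visibleNames(): string[] {
  const out: string[] = [];
  names.forEach((name, id) => {
    if (visible.get(id)) out.push(name);
  });
  return out;
}
```

A widget gains penguin behavior by getting an entry in `penguins`, not by belonging to a subclass, which is one reading of "importing behavior into a type".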

If you made it this far, thank you for reading. I'm curious what thoughts or resources this brings up for people.

Posted
1 year ago