Typed Architectures: Architectural Support for Lightweight Scripting

Presented on April 25, 2017
Presenter: Michael

Preview

After lengthy mental wrestling and lab deliberation, I’ve decided to present the paper “Typed Architectures: Architectural Support for Lightweight Scripting” this Thursday. The paper is attached. Its main idea is to extend an ISA so that the hardware tracks types and run-time type checks become cheaper; the motivation is that dynamically typed scripting languages running on resource-constrained boards spend too many instructions on this type checking.

Michael

Paper Summary

(Note: there were slides created for this reading group, which also serve as a decent summary of the talk and cover what we went over; they are attached below.)

The authors’ goal is to make scripting on resource-constrained IoT devices more efficient, because these devices normally don’t have enough memory for JIT compilation, and a large portion of execution time is spent on dynamic type checking. Their solution is a new architecture that reduces dynamic instruction count while being cheap (minimal increases in chip area and power usage) and flexible (supporting multiple scripting engines).

The high overhead of dynamic type checking comes from tag extraction (figuring out the type of the value to which a variable is bound), tag checking (guards), and tag insertion (when storing a value back into memory). Their profiling shows that fewer than 10 bytecodes dominate the bytecode counts in their benchmark programs, and they choose to optimize ADD, SUB, MUL, GETTABLE, and SETTABLE (in Lua; JavaScript has semantically equivalent bytecodes with different names).
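To make the three sources of overhead concrete, here is a minimal sketch of what a software interpreter does for an ADD bytecode. It is not code from the paper or from Lua; the tag constants, the `TValue` class, `op_add`, and `slow_add` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical tag encoding for illustration; real engines use their own.
TAG_INT = 0
TAG_FLOAT = 1
TAG_STRING = 2


@dataclass
class TValue:
    """A boxed value: a type tag stored next to the payload."""
    tag: int
    payload: object


def slow_add(a: TValue, b: TValue) -> TValue:
    """Stand-in for the engine's generic path (coercions and so on)."""
    raise TypeError(f"no fast path for tags {a.tag} and {b.tag}")


def op_add(regs: list, dst: int, lhs: int, rhs: int) -> None:
    """Software ADD handler: each commented step costs extra instructions."""
    a, b = regs[lhs], regs[rhs]
    tag_a, tag_b = a.tag, b.tag                              # tag extraction
    if tag_a == TAG_INT and tag_b == TAG_INT:                # tag check (guard)
        regs[dst] = TValue(TAG_INT, a.payload + b.payload)   # tag insertion
    elif tag_a == TAG_FLOAT and tag_b == TAG_FLOAT:          # second guard
        regs[dst] = TValue(TAG_FLOAT, a.payload + b.payload)
    else:
        regs[dst] = slow_add(a, b)                           # generic slow path


regs = [TValue(TAG_INT, 2), TValue(TAG_INT, 3), None]
op_add(regs, dst=2, lhs=0, rhs=1)
print(regs[2])
```

Only one line in each fast path does the actual arithmetic; everything else is type bookkeeping, which is the work the paper wants to move into hardware.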

They extend the ISA by changing the register file (adding a type field and an F/I field); adding tagged ALU instructions (xadd, xsub, and xmul), which allow them to perform type checking in parallel with value calculation (i.e., at the same time in the pipeline); and adding tagged memory instructions (tld and tsd). They also add a few more registers: R-hdl, which stores the label to jump to on a type misprediction, and R-offset, R-shift, and R-mask, which make tag lookup and storage more engine-agnostic.
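The following toy software model shows how these pieces could fit together. It is a sketch under my own assumptions, not the paper's exact semantics: I assume R-offset is the distance from a value to the word holding its tag, that R-shift and R-mask select the tag bits within that word, and that xadd only succeeds when both operand tags match. The `TypedCore` class, the tag constants, and the memory layout are hypothetical, and the F/I field is left out.

```python
# Hypothetical tag values; the real encoding is engine-specific.
TAG_INT = 0
TAG_FLOAT = 1
ARITH_TAGS = {TAG_INT, TAG_FLOAT}


class TypedCore:
    """Toy model of a register file whose entries carry a type tag."""

    def __init__(self, offset: int, shift: int, mask: int):
        self.r_offset = offset   # assumed: address distance from value to tag word
        self.r_shift = shift     # assumed: bit position of the tag in that word
        self.r_mask = mask       # assumed: bit mask covering the tag
        self.r_hdl = None        # label to jump to on a type misprediction
        self.regs = {}           # register name -> (value, tag)
        self.mem = {}            # address -> integer word

    def tld(self, rd: str, addr: int) -> None:
        """Tagged load: fetch the value and extract its tag in one step."""
        tag_word = self.mem[addr + self.r_offset]
        tag = (tag_word >> self.r_shift) & self.r_mask
        self.regs[rd] = (self.mem[addr], tag)

    def tsd(self, rs: str, addr: int) -> None:
        """Tagged store: write the value and insert its tag in one step."""
        value, tag = self.regs[rs]
        self.mem[addr] = value
        old = self.mem.get(addr + self.r_offset, 0)
        cleared = old & ~(self.r_mask << self.r_shift)
        self.mem[addr + self.r_offset] = cleared | ((tag & self.r_mask) << self.r_shift)

    def xadd(self, rd: str, rs1: str, rs2: str):
        """Tagged add: returns None on success, or the handler label on a miss."""
        (v1, t1), (v2, t2) = self.regs[rs1], self.regs[rs2]
        if t1 != t2 or t1 not in ARITH_TAGS:
            return self.r_hdl            # hardware would redirect the PC here
        self.regs[rd] = (v1 + v2, t1)    # the result inherits the operand tag
        return None


# Assumed layout: an 8-byte value followed by a tag word, tag in the low byte.
core = TypedCore(offset=8, shift=0, mask=0xFF)
core.r_hdl = "slow_add"
core.mem.update({0: 2, 8: TAG_INT, 16: 3, 24: TAG_INT})
core.tld("a0", 0)
core.tld("a1", 16)
miss = core.xadd("a2", "a0", "a1")
core.tsd("a2", 32)
print(miss, core.mem[32], core.mem[40])
```

Switching to a different engine in this model only means constructing `TypedCore` with different offset, shift, and mask values, which is the sense in which the extra registers make the tag handling engine-agnostic.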

In their evaluation, they use a simulated in-order, 50 MHz RISC-V processor with a 5-stage pipeline, and they achieve geomean speedups of 9.9% and 11.2% in Lua and SpiderMonkey, respectively. They also compare their results against a similar approach, called Checked Load, which achieves speedups of 7.3% and 5.4% in Lua and JavaScript, respectively, showing that they improve on the current state of the art. They attribute their improvements to a reduced dynamic instruction count and reduced pressure on the branch predictor, instruction cache, etc. They also report, without being too clear on the specifics, that these benefits come at a cost of a 1.6% increase in total chip area and a 3.7% increase in power usage.
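For readers unfamiliar with geomean speedups, the sketch below shows how such a figure is usually computed from per-benchmark results. The ratios are made-up placeholders, not numbers from the paper, and `geomean_speedup` is a hypothetical helper name.

```python
import math


def geomean_speedup(ratios):
    """Geometric mean of per-benchmark speedup ratios (baseline time / new time)."""
    return math.prod(ratios) ** (1.0 / len(ratios))


# Made-up per-benchmark ratios for illustration only.
ratios = [1.05, 1.10, 1.20]
percent = (geomean_speedup(ratios) - 1.0) * 100.0
print(f"geomean speedup: {percent:.1f}%")
```

The geometric mean is the usual choice for ratios because it does not depend on which configuration is treated as the baseline.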

Finally, in their related work, they cite hardware-based type-checking accelerators such as those on the LISP machines; hardware-supported metadata processing (taint tracking, memory bounds checking); and JIT-based type specialization (Madhukar’s paper on static then profile-guided type inference). They argue that these “[p]rofile-guided dynamic compilation techniques are not a viable option on emerging IoT devices due to the high cost…”

Lab Discussion

The paper was very straightforward, so there wasn’t much that was controversial to discuss. There was a question about the metrics used to report the power usage and area increases, but I didn’t feel the paper did a great job detailing this particular point. I would be interested in reading more about how their techniques really differ from those of the LISP machines (they cite a few papers), and in finishing Madhukar’s paper that they cite.

Attachments