BEAM-native JS interpreter #5

Open
dannote wants to merge 32 commits into master from beam-vm-interpreter

@dannote dannote commented Apr 15, 2026

QuickJS bytecode interpreter running natively on the BEAM — no NIF threads for execution.

What

Reuses the existing QuickJS compiler (via NIF) to produce bytecode, then executes it in a pure Elixir interpreter. QuickBEAM.eval(rt, code, mode: :beam) compiles via NIF, runs on BEAM.

Architecture

JS source → QuickJS compiler (NIF) → QJS bytecode binary
                                         │
                                         ▼
                               Bytecode decoder (Elixir)
                                         │
                                         ▼
                               Instruction interpreter (Elixir)
                                    + JS runtime library
                                         │
                                         ▼
                                    JS result

Performance

3.5–4.3x faster than QuickJS C NIF on numeric workloads:

sum(1000): BEAM VM 86µs vs NIF 375µs
sum(50K):  BEAM VM 3.9ms vs NIF 13.5ms

Test coverage

546 beam VM tests, 284 dual-mode NIF/BEAM comparisons:

| Suite | Tests |
|---|---|
| Bytecode decoder | 25 |
| Interpreter unit | 69 |
| Beam mode integration | 16 |
| Compat (quickbeam_test mirror) | 149 (+3 class excluded) |
| Dual-mode NIF=BEAM comparison | 284 |

JS features supported

Primitives, arithmetic, comparison, logical/bitwise operators, strings (20 methods), arrays (22 methods), objects (Object.keys/values/entries/assign), Math (16 functions + constants), JSON parse/stringify, Number/parseInt/parseFloat, closures with mutable cells, for/while/do-while/for-in, break/continue, switch, try/catch/finally, destructuring, spread/rest, template literals, typeof, optional chaining, nullish coalescing, Unicode (Latin-1, CJK, emoji with surrogate pairs).

Known gaps

  • Class syntax (define_class / define_method prototype chain setup)

  • this-binding when methods are wrapped via fclosure8

  • No cross-eval state persistence (each eval is independent)

Files

| File | Lines | Purpose |
|---|---|---|
| beam_vm/opcodes.ex | 240 | 246 opcodes, BC_TAG constants |
| beam_vm/leb128.ex | 70 | LEB128 codec |
| beam_vm/bytecode.ex | 170 | Binary format parser (atoms, functions, constants) |
| beam_vm/predefined_atoms.ex | 230 | JS_ATOM indices 1-228 |
| beam_vm/decoder.ex | 190 | Bytecode → instruction tuples with label resolution |
| beam_vm/interpreter.ex | 1500 | Stack interpreter, 130+ opcode handlers |
| beam_vm/runtime.ex | 120 | Property resolution, prototype chains |
| beam_vm/runtime/array.ex | 300 | Array.prototype methods |
| beam_vm/runtime/string.ex | 160 | String.prototype methods |
| beam_vm/runtime/builtins.ex | 210 | Math, Number, Boolean, console, global functions |
| beam_vm/runtime/json.ex | 50 | JSON.parse/stringify via OTP :json |
| beam_vm/runtime/object.ex | 50 | Object static methods |
| beam_vm/runtime/regexp.ex | 40 | RegExp stub |

dannote added 6 commits April 15, 2026 01:10
Implement pure Elixir parser for QuickJS bytecode binaries:

- LEB128: unsigned/signed LEB128 reading, u8/u16/u32/u64/i32 helpers
- Opcodes: all 246 QuickJS opcodes, BC_TAG constants, BC_VERSION=24
- Bytecode: full deserialization matching JS_ReadObjectAtoms/JS_ReadFunctionTag
  - Atom table, objects (null, undefined, bool, int32, float64, string,
    function_bytecode, object, array, bigint, regexp)
  - Function bytecode: flags (raw u16 LE), locals, closure vars, constant pool,
    raw bytecode bytes, debug info
  - Correct atom resolution: predefined atoms (<229) vs user atom table
  - 25 tests all passing
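The LEB128 reading mentioned above can be illustrated with a minimal sketch. The module name `Leb128Sketch` and its function names are hypothetical, and the signed variant assumes a zigzag mapping (an assumption about the QuickJS writer, not something this PR states); the actual `beam_vm/leb128.ex` may look different.

```elixir
defmodule Leb128Sketch do
  import Bitwise

  # Unsigned LEB128: 7 payload bits per byte, least significant group first.
  # A set high bit means another byte follows.
  def read_unsigned(bin), do: read_unsigned(bin, 0, 0)

  defp read_unsigned(<<1::1, low::7, rest::binary>>, acc, shift),
    do: read_unsigned(rest, acc ||| (low <<< shift), shift + 7)

  defp read_unsigned(<<0::1, low::7, rest::binary>>, acc, shift),
    do: {:ok, acc ||| (low <<< shift), rest}

  defp read_unsigned(<<>>, _acc, _shift), do: {:error, :unexpected_end}

  # Signed variant. ASSUMPTION: zigzag mapping (0, -1, 1, -2, ...),
  # not the sign-extension form used by standard SLEB128.
  def read_signed(bin) do
    with {:ok, v, rest} <- read_unsigned(bin) do
      {:ok, bxor(v >>> 1, -(v &&& 1)), rest}
    end
  end
end
```

Both functions return the unread remainder of the binary so a parser can chain reads.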
Implements a QuickJS bytecode interpreter running natively on the BEAM:

Decoder:
- Two-pass decoding: first pass builds byte-offset→instruction-index map,
  second pass decodes instructions with label resolution
- All operand formats: u8/i8/u16/i16/u32/i32, labels (8/16/32), atoms,
  const pool indices, local/arg/var_ref (u16), npop
- Label resolution: relative byte offsets → instruction indices
- Atom operand format: writer index resolution (predefined vs user atom table)
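The two-pass scheme can be sketched on a toy instruction set. Everything here is hypothetical: the module name `LabelSketch`, the three opcodes, and their byte values are illustrative and are not the QuickJS encoding. The sketch also assumes a label's relative offset is measured from the operand byte itself.

```elixir
defmodule LabelSketch do
  # Hypothetical toy opcodes (NOT the QuickJS numbering):
  #   0x01 push_i8 <i8>   0x02 goto8 <rel i8>   0x03 return
  @sizes %{0x01 => 2, 0x02 => 2, 0x03 => 1}

  # Pass 1: map each instruction's starting byte offset to its index.
  def offset_map(bin), do: offset_map(bin, 0, 0, %{})

  defp offset_map(<<>>, _off, _idx, map), do: map

  defp offset_map(<<op, _::binary>> = bin, off, idx, map) do
    size = Map.fetch!(@sizes, op)
    <<_::binary-size(size), rest::binary>> = bin
    offset_map(rest, off + size, idx + 1, Map.put(map, off, idx))
  end

  # Pass 2: decode instructions, turning relative byte offsets into
  # instruction indices via the map from pass 1.
  def decode(bin) do
    map = offset_map(bin)
    bin |> decode(0, map, []) |> Enum.reverse() |> List.to_tuple()
  end

  defp decode(<<>>, _off, _map, acc), do: acc

  defp decode(<<0x01, n::signed-8, rest::binary>>, off, map, acc),
    do: decode(rest, off + 2, map, [{:push_i8, n} | acc])

  # ASSUMPTION: the offset is relative to the operand byte (off + 1).
  defp decode(<<0x02, rel::signed-8, rest::binary>>, off, map, acc),
    do: decode(rest, off + 2, map, [{:goto, Map.fetch!(map, off + 1 + rel)} | acc])

  defp decode(<<0x03, rest::binary>>, off, map, acc),
    do: decode(rest, off + 1, map, [:return | acc])
end
```

Using `Map.fetch!/2` makes a jump into the middle of an instruction fail loudly instead of silently producing a wrong target.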

Interpreter:
- Flat function args dispatch loop with tail-recursive run/3
- One defp per opcode, gas counter for cooperative scheduling
- Pre-decoded instruction tuple for O(1) indexed access
- PC auto-advances via  tuple; branches use explicit targets
- JS value semantics: number/string/boolean/nil/:undefined
- Arithmetic (+, -, *, /, %, pow, neg, inc, dec), bitwise (&, |, ^, <<, >>, >>>)
- Comparisons (<, <=, >, >=, ==, !=, ===, !==) with JS abstract equality
- Control flow: if_true/8, if_false/8, goto/8/16, return, return_undef
- Stack manipulation: dup, drop, nip, swap, rot, perm, insert
- Locals/args: get/put/set variants (including short forms 0-3)
- Functions: fclosure/8, call/0-3, tail_call, call_method
- Unary: neg, plus, inc, dec, not, lnot, typeof
- Global vars: get_var_undef, get_var, put_var, put_var_init (return :undefined)
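A minimal sketch of the dispatch style listed above: one function clause per opcode, a pre-decoded instruction tuple indexed by PC, and a gas counter that stops runaway loops. The module `LoopSketch` and its five opcodes are hypothetical and much simpler than the PR's `run/3`.

```elixir
defmodule LoopSketch do
  # Instructions live in a tuple so elem/2 gives O(1) access by PC.
  def run(code, gas \\ 1_000) when is_tuple(code), do: step(code, 0, [], gas)

  # Out of gas: stop instead of looping forever.
  defp step(_code, _pc, _stack, 0), do: {:error, :out_of_gas}
  defp step(code, pc, stack, gas), do: op(elem(code, pc), code, pc, stack, gas - 1)

  # One clause per opcode; the stack is a list with the top at the head.
  defp op({:push, v}, code, pc, stack, gas), do: step(code, pc + 1, [v | stack], gas)
  defp op(:add, code, pc, [b, a | rest], gas), do: step(code, pc + 1, [a + b | rest], gas)
  defp op({:goto, target}, code, _pc, stack, gas), do: step(code, target, stack, gas)

  defp op({:if_false, target}, code, pc, [v | rest], gas),
    do: step(code, if(v, do: pc + 1, else: target), rest, gas)

  defp op(:return, _code, _pc, [top | _], _gas), do: {:ok, top}
end
```

Because `step/4` and `op/5` only ever call each other in tail position, the loop runs in constant BEAM stack space regardless of how many instructions execute.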

Key fixes during development:
- Bytecode flags: has_debug_info is bit 11 (not bit 10); full debug skip
  (filename, line, col, pc2line, source)
- Atom operands use writer index format (u32 >= JS_ATOM_END → atom table)
- loc/arg/var_ref operands are u16 (not u32); const is u32
- Label8/16 must resolve through offset map (not raw byte offsets)
- tail_call/tail_call_method throw return directly (no continuation)
- call_function/call_method advance PC by 1 before continuing

Tests: 34 interpreter tests + 25 bytecode tests = 59 total, all passing
…el resolution

Critical fixes to make the BEAM VM interpreter work correctly:

Args vs locals:
- Arguments accessed via get_arg/0-3 read from process dictionary (:qb_arg_buf),
  separate from locals. In QuickJS, args are in arg_buf, not in the var_buf.
- invoke_function stores args in process dict, not in local slots.
- Fixes set_loc_uninitialized overwriting parameter values.

Post-inc/dec stack order:
- post_inc pushes [new, old] (new on top), not [old, new].
- Matches QuickJS C: sp[-1] = val+1, then push old val above.
- put_loc_check after post_inc now correctly writes incremented value.

Label resolution:
- label8/16 operands now resolve through byte-offset→instruction-index map
- Previously returned raw byte offsets, causing jumps to wrong instructions.

Atom resolution:
- Fixed get_atom_u32: check v >= JS_ATOM_END BEFORE band(v,1) tagged int check.
- Prevents atom table index 0 from being misidentified as tagged_int(114).

Operand sizes:
- Fixed loc/arg/var_ref format: reads u16 (not u32), matching QuickJS C.
- const format reads u32 (correct).

Benchmark results (sum loop):
- BEAM VM: 86µs for sum(1000), 3.9ms for sum(50000)
- NIF QJS: 375µs for sum(1000), 13.5ms for sum(50000)
- BEAM VM is 3.5-4.3x faster than QuickJS C NIF across all sizes!
…sing

Major additions to the BEAM VM interpreter:

Objects (mutable via process dictionary):
- object opcode creates {:obj, ref} with process dict storage
- define_field, get_field, put_field all use atom-resolved keys
- Nested objects work (object values stored as {:obj, ref})
- get_length supports obj/map/list/string

Closures:
- fclosure builds {:closure, captured_map, function} tuples
- Captures variables from both locals and arg_buf
- invoke_closure sets up var_refs from captured values
- get_var_ref/get_var_ref_check read from vrefs list

Named function self-reference:
- special_object(2) pushes current function for named recursion
- Stored in process dict (:qb_current_func) during do_invoke
- Enables factorial, fibonacci via get_loc + call

New opcode handlers (25+):
- define_var, check_define_var — variable declarations
- get_field2 — computed property access
- catch, nip_catch — try/catch
- for_in_start, for_in_next — for-in loops
- call_constructor, init_ctor — new X()
- instanceof, delete, in — operators
- regexp, append, define_array_el — regex/spread
- make_var_ref/make_arg_ref/make_loc_ref — closure cell creation
- get_ref_value, put_ref_value — cell read/write
- gosub, ret — finally blocks
- for_of_start/next, iterator_* — iterator stubs
- push_this, set_home_object, set_proto — class stubs
- And more

Critical fixes:
- insert2/3/4: stack order corrected (obj a → a obj a)
- define_field: only pushes obj (consumes value), matching QuickJS
- put_field: mutates object in-place via process dict
- resolve_atom(:empty_string) returns ""
- build_closure reads from both locals AND arg_buf

Test coverage: 69 tests, 0 failures
- New: objects (5), arrays (5), closures (2), strings (4), null/undef ops (6),
  short-circuit (4), ternary (3), modulo/power (2), complex (4)
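The `{:obj, ref}` storage model can be sketched as follows. The module `ObjSketch`, its function names, and the choice to use the handle itself as the process-dictionary key are assumptions for illustration; the PR only states that objects are `{:obj, ref}` handles backed by the process dictionary.

```elixir
defmodule ObjSketch do
  # A JS object is a handle {:obj, ref}. Its properties live in the process
  # dictionary, so every holder of the handle observes the same mutations.
  def new(props \\ %{}) do
    handle = {:obj, make_ref()}
    Process.put(handle, props)
    handle
  end

  # Missing properties read as :undefined, as in JS.
  def get({:obj, _} = handle, key), do: Map.get(Process.get(handle), key, :undefined)

  def put({:obj, _} = handle, key, val) do
    Process.put(handle, Map.put(Process.get(handle), key, val))
    handle
  end
end
```

Nested objects fall out naturally: a property value can itself be an `{:obj, ref}` handle, and the inner object is mutated independently of the outer map.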
…ON, and more

Implements QuickBEAM.BeamVM.Runtime with JS built-in constructors, prototype
methods, and global functions. All property access now goes through the
runtime's prototype chain resolution.

Built-in objects:
- Array: push, pop, shift, unshift, map, filter, reduce, forEach, indexOf,
  includes, slice, splice, join, concat, reverse, sort, flat, find, findIndex,
  every, some, toString
- String: charAt, charCodeAt, indexOf, lastIndexOf, includes, startsWith,
  endsWith, slice, substring, substr, split, trim, trimStart, trimEnd,
  toUpperCase, toLowerCase, repeat, padStart, padEnd, replace, replaceAll,
  match, concat, toString, valueOf
- Object: keys, values, entries, assign, freeze, is, create
- Math: floor, ceil, round, abs, max, min, sqrt, pow, random, trunc, sign,
  log, log2, log10, sin, cos, tan, PI, E, LN2, LN10, etc.
- JSON: parse, stringify (via Jason)
- Number: toString, toFixed, valueOf; global parseInt, parseFloat, isNaN, isFinite
- Boolean: toString, valueOf
- Error: constructor with message property
- RegExp: test, exec, source, flags, toString
- Date: constructor, now()
- Console: log, warn, error, info, debug
- Symbol, Promise, Map, Set constructors

Runtime integration:
- Runtime.get_property/2 handles full prototype chain for arrays, strings,
  numbers, booleans, objects, regexps
- Interpreter wired: get_field → Runtime.get_property, get_var → global bindings
- call_function/call_method handle {:builtin, name, callback} tuples
- Builtin callbacks support 1-arity (simple), 2-arity (with this), 3-arity
  (with interpreter for higher-order functions like map/filter/reduce)

Critical fixes:
- Predefined atom table: indices 1-228 (atom 0 = JS_ATOM_NULL, not a real atom)
- Atom encoding in bytecode: emit_atom writes raw JS_Atom values, not
  bc_atom_to_idx. Tagged ints have bit 31 set (not bit 0).
- resolve_atom({:predefined, idx}) now looks up actual string name from
  PredefinedAtoms table instead of returning opaque tuple

Tests: 94 tests (69 interpreter + 25 bytecode), 0 failures

@dannote dannote force-pushed the beam-vm-interpreter branch from 0eb3475 to 7c1c574 April 15, 2026 14:06
dannote added 15 commits April 15, 2026 18:29
Phase 3: Dual-mode execution API
- QuickBEAM.eval(rt, code, mode: :beam) compiles via NIF then executes
  on the BEAM VM interpreter. Default mode: :nif (unchanged).
- convert_beam_result/1 converts interpreter values (atoms, obj refs,
  :undefined) to standard Elixir values for API compatibility.

Critical fixes:
- inc_loc/dec_loc/add_loc: locals update was computed but discarded
  (used 'next' frame instead of updated locals). Caused infinite loops.
- Default gas increased to 1B (100M was tight for nested function calls).
- get_field2: now correctly pops 1 and pushes 2 (keeps object for
  call_method this-binding). Previous handler consumed the object.
- get_field2: handler now accepts atom operand (was matching []).
- Atom encoding: predefined atoms (1-228) vs user atoms (>=229) vs
  tagged ints (bit 31). Matches bc_atom_to_idx/bc_idx_to_atom exactly.
- :json module used for JSON parse/stringify (returns value directly,
  not {:ok, val} tuples). Rescue on decode errors.

Beam mode integration tests: 16 tests covering arithmetic, functions,
control flow, objects, arrays, built-ins (Math), loops.

Arrays are now stored as {:obj, ref} in process dictionary for in-place
mutation. All array methods (push, pop, map, filter, reduce, forEach,
reverse, sort, join, slice, indexOf, includes, find, findIndex, every,
some, concat, flat) handle {:obj, ref} by dereferencing the list.

Critical fixes:
- tail_call and tail_call_method: added builtin dispatch (was only
  handling Bytecode.Function and closures)
- get_field2: fixed stack semantics (pops 1, pushes 2 to keep obj)
- get_length: handles list-backed {:obj, ref} arrays
- get_array_el: handles {:obj, ref} arrays
- inc_loc/dec_loc/add_loc: locals update was discarded (used next frame)
- String.prototype dispatch: fixed String.prototype_method → string_proto_property
- NaN !== NaN: custom js_strict_eq with :nan handling
- typeof: handles :nan, :infinity, {:builtin, _, _}
- Math.max/min: no longer forces float conversion
- JSON.stringify: converts iodata to binary
- :binary.match: fixed incorrect scope option
- Global bindings: added NaN, Infinity, console

Compat score: 87/91 JS features pass through beam mode

runtime.ex (937 → 181 lines) now holds only property resolution,
global_bindings, call_builtin_callback, and shared helpers.

New sub-modules under runtime/:
  array.ex    (285) — Array.prototype + Array static
  string.ex   (155) — String.prototype
  builtins.ex (193) — Math, Number, Boolean, Console, constructors, globals
  json.ex      (45) — JSON.parse/stringify
  object.ex    (52) — Object static methods (keys, values, entries, assign)
  regexp.ex    (40) — RegExp prototype (test, exec, source, flags)

Cross-module calls promoted from defp to def:
  js_truthy, js_to_string, js_strict_eq, to_int, to_float, to_number,
  norm_idx, normalize_index, obj_new, call_builtin_callback

Cleanup during split:
- Removed duplicate entries in global_bindings (NaN, Infinity, console)
- Deduplicated {:obj, ref} variants in array_flat/find/findIndex/every/some
- Removed dead put_back_array function
- Fixed RegExp.to_string naming conflict with Kernel.to_string/1

Try/catch mechanism:
- catch opcode pushes a catch offset marker and records handler in
  process dictionary catch stack
- throw checks catch stack: if handler exists, restores stack to
  catch point and pushes thrown value, jumps to handler
- nip_catch pops the catch offset from stack and handler from catch stack
- If no catch handler, throw propagates to eval boundary

Computed property assignment:
- put_array_el now actually stores values in {:obj, ref} objects
  (was a no-op). Handles both list-backed arrays (numeric keys) and
  map-backed objects (string keys)

JSON.stringify fix:
- :json.encode iodata converted to binary via IO.iodata_to_binary

Compat: 90/91 JS features pass through beam mode. Only remaining gap
is forEach with closure mutation (var_ref write across closures).

Closures now use shared mutable cells stored in the process dictionary,
enabling proper variable mutation across function boundaries.

How it works:
- setup_captured_locals: when invoking a function with captured locals
  (is_captured=true, var_ref_idx), creates a {:cell, ref} for each
  and stores local→vref mapping in process dict
- build_closure: reuses parent's existing cells (via :qb_local_to_vref)
  instead of creating new ones — ensures mutations are shared
- get_loc/put_loc/set_loc: check :qb_local_to_vref mapping and
  redirect reads/writes through the shared cell
- get_var_ref/put_var_ref/set_var_ref: read/write from cell tuples
  passed in the vrefs list

Also fixes:
- put_array_el: now stores values in {:obj, ref} objects (was no-op)
- try/catch: proper catch stack with catch offset markers
- JSON.stringify: IO.iodata_to_binary for :json.encode output

Compat: 91/91 JS features pass through beam mode. 0 failures.
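The shared-cell idea can be sketched in a few lines. The module `CellSketch` and its functions are hypothetical; the sketch only shows why two closures that capture the same `{:cell, ref}` observe each other's writes, which is the property the commit describes.

```elixir
defmodule CellSketch do
  # A captured variable becomes a {:cell, ref} whose current value lives
  # in the process dictionary.
  def new(val) do
    cell = {:cell, make_ref()}
    Process.put(cell, val)
    cell
  end

  def read({:cell, _} = cell), do: Process.get(cell)

  def write({:cell, _} = cell, val) do
    Process.put(cell, val)
    val
  end

  # Two "closures" sharing one cell, like a JS counter with inc() and get().
  def make_counter do
    cell = new(0)
    inc = fn -> write(cell, read(cell) + 1) end
    get = fn -> read(cell) end
    {inc, get}
  end
end
```

If each closure instead received a copy of the value at creation time, `get` would keep returning 0, which matches the forEach mutation gap reported before this commit.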
Review fixes (a79227d + 9a5b594):

1. Remove duplicate get_arg opcode (line 232 vs 284) and dead
   put_arg/set_arg handlers — args are read from :qb_arg_buf process
   dict, not locals
2. Fix :qb_local_to_vref stale mapping: convert from per-key process
   dict entries {:qb_local_to_vref, idx} to single map stored under
   :qb_local_to_vref atom. save/restore in do_invoke prevents inner
   functions from clobbering outer mappings
3. Fix regexp opcode underscored variables (_pattern/_flags → pattern/flags)
4. Remove unused obj_get/2, get_field/2, get_property/2 private fns
5. IO.iodata_to_binary in JSON.stringify IS needed (:json.encode
   returns iodata, not binary) — reviewer note was incorrect
6. Save/restore :qb_catch_stack in do_invoke after block
7. Fix inc_loc/dec_loc/add_loc to update captured cells via
   write_captured_local

Also fixes define_var/check_define_var operand arity (atom_u8 = 2
operands, was matching only 1). New tests: 91/91 compat, 110 unit.

Comprehensive test suite mirroring existing QuickBEAM tests through
beam mode, covering 152 test cases across 25 describe blocks:

- Basic types, arithmetic, comparison, logical operators
- String operations (16 methods)
- Arrays (22 methods + Array.isArray)
- Objects (10 operations including Object.keys/values/entries)
- Functions (closures, arrow, recursive, higher-order, rest params)
- Control flow (if/else, ternary, while, for, for-in, do-while,
  break, continue, switch)
- typeof, destructuring, spread
- Math (10 functions + constants), JSON, parseInt/parseFloat
- Try/catch/finally, errors, null vs undefined
- Bitwise operators, template literals, edge cases
- Classes, generators, Map/Set (graceful skip if unsupported)

New opcode implementations:
- set_arg/set_arg0-3: argument mutation for default/rest params
- get_array_el2: 2-element array access (destructuring prep)
- apply: Function.prototype.apply semantics
- copy_data_properties: object spread operator
- for_of_next: for...of iterator protocol
- define_method/define_method_computed: class method definitions
- define_class/define_class_computed: class declarations

Other fixes:
- put_var/put_var_init: now store values in globals (was no-op)
- get_var: throws ReferenceError for undeclared variables
- get_var_undef: returns undefined for undeclared (not error)
- resolve_global: distinguish not-found from value=undefined
  via {:found, val} / :not_found tuple
- call_constructor: handles builtin constructors (Error etc),
  adds name property automatically
- Error objects: convert_beam_value now dereferences {:obj, ref}
  for thrown errors
- append opcode: fix stack order (was 2-elem, should be 3→2)
- number_to_fixed: fix :erlang.float_to_binary OTP 26+ options
- Number.isNaN/isFinite/isInteger static methods
- set_global helper for put_var

Fixes for_of_next, for_in_next, and iterator_next stack order:
done_flag must be on top (head) for if_false to check correctly.
Previously iter was on top, causing drop to remove the iterator
instead of the done flag, breaking destructuring and for-in loops.

New opcode implementations:
- push_this: reads :qb_this from process dict (constructor this)
- check_ctor: no-op (validates constructor context)
- check_ctor_return: returns this or the explicit return value
- return_undef: returns :qb_this for constructor returns

call_constructor rewrite:
- Creates new object and stores as :qb_this before invoking ctor
- Restores previous :qb_this in after block
- Returns the new object if ctor doesn't return an object

Other fixes:
- define_array_el: keep idx on stack (was popping all 3), handle
  out-of-bounds by extending array
- append: handle {:obj, ref} arrays (stored in process dict)
- set_arg: expand arg_buf tuple when idx >= current size
- copy_data_properties: fix mask-based stack indexing (sp[-1-n])
- for_in_start: extract keys from actual object (was empty stub)
- for_of_start: create real iterator from array/object (was stub)
- Remove duplicate for_of_next/iterator_close handlers
- Remove debug logger statements

Compat: 149/152 pass (3 class tests excluded pending full
js_op_define_class implementation). Spread, destructuring, for-in,
for-of, default params all working.

Comprehensive test coverage for JS built-in objects in beam mode:

Array.prototype (62 tests):
  push, pop, shift, unshift, map, filter, reduce, forEach,
  indexOf, includes, slice, splice, join, concat, reverse,
  sort, find, findIndex, every, some, flat, Array.isArray

String.prototype (38 tests):
  charAt, charCodeAt, indexOf, lastIndexOf, includes,
  startsWith, endsWith, slice, substring, split, trim,
  trimStart, trimEnd, toUpperCase, toLowerCase, repeat,
  padStart, padEnd, replace, replaceAll, concat

Object static (14 tests):
  keys, values, entries, assign, freeze

Math (32 tests):
  floor, ceil, round, abs, max, min, sqrt, pow, trunc,
  sign, random, log, log2, log10, sin, cos, tan, constants

JSON (14 tests):
  parse (object, array, string, number, boolean, null, nested)
  stringify (object, array, string, null, boolean, round-trip)

Number (10 tests):
  Number() conversion, isNaN, isFinite, isInteger,
  MAX_SAFE_INTEGER, toFixed

Global functions (10 tests):
  parseInt (with radix), parseFloat, isNaN, isFinite

Error constructors (4 tests):
  Error, TypeError, RangeError

Type coercion (12 tests):
  string+number, boolean+number, String(), Boolean()

Operators (18 tests):
  NaN equality, null/undefined, bitwise, integer edge cases

Bug fixes:
- JSON.stringify: handle {:obj, ref} arrays (was crashing
  on Map.new with list input)
- String.indexOf: handle empty needle (return 0, not crash
  from :binary.match with empty pattern)
…ap access

New test file: wpt_language_test.exs (59 tests, 54 pass, 5 pending)

Test coverage:
- Variable scoping: var hoisting, let block scope, let in for loops
- Closure patterns: counter, accumulator, forEach mutation,
  nested closures, IIFE capture
- Iteration: nested loops, while+break, for+continue,
  map/filter/reduce chains, forEach building objects
- Error handling: catch+continue, finally, nested try/catch,
  try/catch in loops
- Object patterns: computed properties, method calls, deletion,
  in operator, for-in, nested access
- Recursion: factorial, fibonacci, binary search, tree traversal
- String processing: word count, reverse words, capitalize,
  camelCase→kebab, count occurrences
- Array algorithms: unique, flatten, group by, zip, insertion sort
- Switch: matching, default, fall-through, string switch
- Conditionals: nullish coalescing, optional chaining, short-circuit
- Destructuring: array, object, nested, swap
- Template literals: expressions, multipart, nested ternary
- Real-world: memoized fib, event emitter, linked list, pipeline,
  deep clone, matrix operations

Interpreter fixes:
- call_method: set :qb_this around function/closure invocations
  so push_this returns the correct receiver object
- tail_call_method: same this-binding fix, also pass obj as first
  arg to functions (was missing)
- get_array_el: handle {:obj, ref} map objects (not just lists),
  support both integer and string keys for map lookup

Pending (5 tests):
- Object methods with this (push_this not reached — needs further
  investigation of closure wrapping in fclosure8)
- Gas exhaustion on deep recursion (fib(30), binary search,
  tree traversal with gas halving)

Replace hand-written wpt_builtins_test.exs and wpt_language_test.exs
with dual_mode_test.exs that runs identical JS expressions through both
NIF (QuickJS C) and BEAM interpreter, asserting matching results.

This approach catches real semantic divergences mechanically instead
of testing against hand-written expected values.

252 expressions tested across: primitives (50), String (31), Array (39),
Object (14), Math (16), JSON (10), global functions (23), control flow
& functions (27), type coercion (8). All NIF/BEAM pairs match.

Bugs found and fixed by dual-mode comparison:

- Bitwise NOT (~): was bsl/&&& instead of Bitwise.bnot
- delete operator: was no-op, now removes key from {:obj, ref} maps
- in operator: stack order was swapped (key/obj reversed)
- splice: was not mutating {:obj, ref} arrays, incorrect element
  removal logic
- Array.isArray: didn't recognize {:obj, ref} arrays
- flat: didn't deref {:obj, ref} sub-arrays
- parseInt: didn't handle radix argument ("ff",16)
- JSON.stringify(null): encoded as '"nil"' instead of 'null'
  (:json.encode needs :null atom, not nil)
- JSON.parse('null'): returned :null atom instead of nil
  (:json.decode returns :null, need to_js conversion)
- put_arg: was not implemented (needed for default param bytecode)
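The two JSON null fixes amount to a value conversion on each side of the codec. A minimal sketch, assuming the interpreter represents JS null as `nil` (as the commit implies); the module `JsonNullSketch` and its function names are hypothetical, and the sketch deliberately does not call `:json` itself.

```elixir
defmodule JsonNullSketch do
  # Before encoding: JS null (nil inside the interpreter) must become :null,
  # otherwise the atom nil is encoded like any other atom, as the string "nil".
  def to_json(nil), do: :null
  def to_json(list) when is_list(list), do: Enum.map(list, &to_json/1)
  def to_json(map) when is_map(map), do: Map.new(map, fn {k, v} -> {k, to_json(v)} end)
  def to_json(other), do: other

  # After decoding: :null coming back from the decoder becomes nil again.
  def from_json(:null), do: nil
  def from_json(list) when is_list(list), do: Enum.map(list, &from_json/1)
  def from_json(map) when is_map(map), do: Map.new(map, fn {k, v} -> {k, from_json(v)} end)
  def from_json(other), do: other
end
```

The conversion has to recurse into lists and maps, since a null can appear at any depth of the value being serialized.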
Add serialization edge cases (nested objects, mixed-type arrays,
deeply nested property access) and complex patterns (fibonacci,
factorial, map/filter/reduce chains, closure counters, forEach
mutation, JSON round-trips, string pipelines, computed properties,
sorted arrays).

Skipped non-ASCII string literals (héllo, 日本語) — bytecode
decoder doesn't handle multi-byte UTF-8 string atoms yet.

Three fixes for string encoding in QJS bytecode:

1. Latin-1 → UTF-8: non-wide strings in QJS bytecode are stored as
   Latin-1 (ISO-8859-1), not UTF-8. Characters like é (0xE9) were
   returned as raw bytes. Now converted via <<b::utf8>> expansion.

2. Wide string byte count: wide strings (is_wide=1) store char count
   in the length field, but each char is 2 bytes (UTF-16). The byte
   read was using char count instead of char_count * 2, causing
   :unexpected_end for CJK strings (日本語, こんにちは世界).

3. UTF-16 surrogate pairs: emoji and other characters above U+FFFF
   are stored as surrogate pairs in wide strings. The wide_to_utf8
   decoder now properly combines high+low surrogates into codepoints
   before converting to UTF-8. Previously <<surrogate::utf8>> crashed
   with :badarg since surrogates aren't valid UTF-8.

Also fixes String.length to return UTF-16 code unit count (matching
JS spec) instead of Unicode grapheme count. "🎉".length now
correctly returns 2, not 1.
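The three string fixes can be sketched together. The module `WideStringSketch` and its function names are hypothetical. Two further assumptions are not stated in the PR: wide strings are read as little-endian UTF-16, and a lone surrogate is replaced with U+FFFD.

```elixir
defmodule WideStringSketch do
  # Latin-1 bytes map 1:1 to code points U+0000..U+00FF.
  def latin1_to_utf8(bin) do
    for <<b <- bin>>, into: "", do: <<b::utf8>>
  end

  # UTF-16 code units to UTF-8, combining surrogate pairs first.
  # ASSUMPTION: little-endian code units.
  def wide_to_utf8(bin), do: wide_to_utf8(bin, [])

  defp wide_to_utf8(<<hi::little-16, lo::little-16, rest::binary>>, acc)
       when hi in 0xD800..0xDBFF and lo in 0xDC00..0xDFFF do
    cp = 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00)
    wide_to_utf8(rest, [<<cp::utf8>> | acc])
  end

  # ASSUMPTION: a lone surrogate becomes U+FFFD, since <<u::utf8>> would raise.
  defp wide_to_utf8(<<u::little-16, rest::binary>>, acc) when u in 0xD800..0xDFFF,
    do: wide_to_utf8(rest, [<<0xFFFD::utf8>> | acc])

  defp wide_to_utf8(<<u::little-16, rest::binary>>, acc),
    do: wide_to_utf8(rest, [<<u::utf8>> | acc])

  defp wide_to_utf8(<<>>, acc), do: acc |> Enum.reverse() |> IO.iodata_to_binary()

  # JS String.length counts UTF-16 code units: 2 for anything above U+FFFF.
  def utf16_length(str) do
    str
    |> String.to_charlist()
    |> Enum.reduce(0, fn cp, n -> if cp > 0xFFFF, do: n + 2, else: n + 1 end)
  end
end
```

For a wide string with a character count of `n`, the reader would take `n * 2` bytes before calling `wide_to_utf8/1`, which is the byte-count fix described in item 2.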
@dannote dannote changed the title from "BEAM-native JS interpreter (Phase 0-1)" to "BEAM-native JS interpreter" Apr 16, 2026
@dannote dannote marked this pull request as ready for review April 16, 2026 08:41
dannote added 6 commits April 16, 2026 12:24
The div(gas, 2) per function call was artificial. BEAM handles
deep recursion natively because process stacks grow dynamically. The gas
counter is for cooperative scheduling (like BEAM reductions), not stack
depth. Memoized fib(30) now works.

Three interpreter bugs fixed:

1. push_this stub shadowed real handler: a duplicate {:push_this, []}
   clause at line 955 always returned :undefined, preventing the real
   handler (which reads :qb_this) from executing. Object methods like
   o.f() where f reads this.x now work correctly.

2. get_loc0_loc1 push order: was [local0, local1] (local0 on top),
   should be [local1, local0] (local1 on top, matching QuickJS C
   where sp++ pushes local0 first then local1). Fixed var ordering
   bug where 'var a=[]; var m=Math.floor(1); a[m]' returned nil.

3. Gas passthrough (previous commit): enables memoized fib(30) and
   other deep recursion patterns.

New dual-mode test expressions: this-binding, get_loc0_loc1 ordering,
memoized fib(30).
…guard

- rest opcode: was stub returning empty array, now collects args from
  start_idx into {:obj, ref} array. (...a) works.
- String bracket indexing: get_array_el now handles binary strings,
  "hello"[1] returns "e" instead of nil.
- new Array(3): Array constructor now returns {:obj, ref} (was plain
  list that call_constructor couldn't handle). get_length also guards
  against nil stored values.
- call_constructor: only add "name" property for Error-family
  constructors (was crashing on Array/other constructors by calling
  Map.has_key? on a list).
- get_loc0_loc1 fix from previous commit enables correct variable
  ordering for all Math.floor/array combinations.

295 dual-mode expressions now match NIF output.
…array_el maps

- Computed property keys ({[k]:1}): define_array_el now handles
  {:obj, ref} with map storage (was only handling list storage),
  converts key to string for map properties
- charAt(-1): returns '' per spec instead of wrapping around to
  last character (String.at allows negative indices in Elixir)
- Array.prototype.lastIndexOf: new implementation scanning from
  end, using js_strict_eq for comparison
- Array.prototype.toString: delegates to join(',')
- define_array_el: cond-based dispatch for list vs map storage

302 dual-mode NIF/BEAM expressions all matching.
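The charAt(-1) fix comes down to guarding the index before delegating to `String.at/2`, which accepts negative indices and counts from the end. A minimal sketch with a hypothetical module name; it indexes by grapheme, so it only matches JS semantics for strings where graphemes and UTF-16 code units coincide (for example ASCII).

```elixir
defmodule CharAtSketch do
  # JS charAt returns "" for any out-of-range index, including negatives.
  def char_at(str, idx) when is_integer(idx) and idx >= 0, do: String.at(str, idx) || ""
  def char_at(_str, _idx), do: ""
end
```

Without the guard, `String.at("abc", -1)` returns `"c"`, which is the wrap-around behaviour the commit removes.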
…e cases

Negative zero:
- js_neg(0) now returns -0.0 (BEAM integers don't have -0)
- js_div detects -0.0 divisor via IEEE 754 binary comparison
- 1/(-0) correctly returns -Infinity

Infinity/NaN arithmetic:
- js_to_number handles :infinity, :neg_infinity, :nan atoms
- js_neg handles Infinity/NaN (was falling through to -:nan crash)
- js_add/js_sub: special value propagation (Inf+Inf=Inf, Inf-Inf=NaN)
- js_mul: Infinity*0=NaN, sign handling for Infinity*negative
- js_strict_eq: :infinity === :infinity, :neg_infinity === :neg_infinity

New built-ins:
- String.fromCharCode(72,101,108,108,111) → 'Hello'
- JSON.stringify(undefined) → nil (was returning 'null' string)
- Array.prototype.lastIndexOf
- Array.prototype.toString (delegates to join)

Other fixes:
- charAt(-1) returns '' (was wrapping to last char via Elixir negative index)
- define_array_el handles map-backed {:obj,ref} (computed property keys)
- rest opcode collects actual args (was returning empty array)
- call_constructor: only add 'name' for Error-family constructors
- get_array_el: string bracket indexing ('hello'[1] → 'e')
- js_mod: returns NaN for zero divisor (was crashing)

309 dual-mode NIF/BEAM expressions all matching.
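Division by a signed zero can be sketched as below. The module `NumSketch` is hypothetical, it handles only finite operands, and it uses the binary-pattern check for negative zero described in this commit (a later commit in this PR switches to a different check). The atoms `:nan`, `:infinity` and `:neg_infinity` are the ones the commit names.

```elixir
defmodule NumSketch do
  # -0.0 == 0.0 is true on the BEAM, so compare the IEEE 754 bit pattern:
  # sign bit set, all other 63 bits zero.
  def neg_zero?(x), do: is_float(x) and <<x::float>> == <<1::1, 0::63>>

  # Division for finite operands only; zero divisors yield special atoms
  # instead of raising ArithmeticError.
  def js_div(a, b) when is_number(a) and is_number(b) do
    cond do
      b != 0 -> a / b
      a == 0 -> :nan
      # The result is negative when exactly one of the operands is negative.
      neg_zero?(b) != (a < 0) -> :neg_infinity
      true -> :infinity
    end
  end
end
```

A negative zero for experimentation can be built from its bit pattern with `<<nz::float>> = <<1::1, 0::63>>`, which avoids relying on how a `-0.0` literal is compiled.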
New built-ins:
- Map: constructor, get, set, has, delete, forEach, size
- Set: constructor, has, add, delete, size
- WeakMap, WeakSet, WeakRef, Proxy: stub constructors
- String.fromCharCode: converts code points to string
- Object.defineProperty: basic value descriptor support
- Object.getOwnPropertyNames: alias for Object.keys

Opcode fixes:
- put_loc8/get_loc8/set_loc8: added passthrough alias expansion
  (was unimplemented, caused crash on functions with >3 locals)
- define_method: keep target object on stack (was popping both
  method and target). Fixes method shorthand {f(){return 42}}

Prototype dispatch:
- Map/Set objects detected by __map_data__/__set_data__ keys
- get_prototype_property for {:obj, ref} now dispatches to
  map_proto/set_proto when internal markers present

320 dual-mode NIF/BEAM expressions matching.
dannote added 5 commits April 16, 2026 14:26
Classes:
- define_class stack order: was [parent, ctor], should be [ctor, parent]
  (top of stack = bfunc = sp[-1], second = parent = sp[-2])
- define_class push order: [proto, ctor] (proto on top, matching QuickJS
  sp[-1]=proto, sp[-2]=ctor)
- call_constructor: pass var_ref with false cell for class constructors
  so get_var_ref_check [0] succeeds (skips super() call path for
  base classes)
- Basic class constructors now work: new P(5).x returns 5

Map/Set:
- Removed duplicate Map/Set entries in global_bindings (first entry
  was a stub that shadowed real constructor)
- Removed duplicate map_constructor/set_constructor stubs in builtins
- Fixed Runtime.obj_new reference (__MODULE__.obj_new)
- Map.get/set/has/delete/forEach and Set.add/has/delete all working

arguments object:
- special_object type 1 now creates arguments array-like object
  from :qb_arg_buf process dict, stored as {:obj, ref}
- arguments.length and arguments[i] both work

580/582 tests pass, 2 excluded (class method + inheritance need
prototype chain walking).
…methods

Review fixes:
1. Remove duplicate opcode handlers (put_arg, check_ctor,
   check_ctor_return, return_undef). Kept correct implementations,
   removed stubs that shadowed them.
2. IO.iodata_to_binary stays — :json.encode returns iodata (list),
   not binary. Verified with elixir -e.
3. number_to_fixed: guard for :nan/:infinity atoms before float
   conversion. Use n*1.0 instead of n/1.
4. js_string_length: fast path for ASCII (byte_size == String.length
   skips charlist allocation).
5. neg_zero?: use :erlang.float_to_list sign check instead of
   hardcoded IEEE 754 binary pattern.
6. Remove empty line gap between opcode sections.

Class prototype chain:
- define_class stores proto ref via :qb_class_proto keyed by ctor hash
- call_constructor sets __proto__ on new instance pointing to class proto
- get_prototype_property walks __proto__ chain for property lookup
- Object.keys hides internal __ keys
- Class methods now work: new R(3,4).area() returns 12
- Class basic constructor: new P(5).x returns 5

581/582 beam VM tests pass. 1 excluded: class inheritance (extends/super
needs full super() call dispatch).

New opcodes:
- get_super: resolves parent constructor from class hierarchy
  (stored via :qb_parent_ctor keyed by function hash)
- special_object type 3: new.target (returns current function)

define_class improvements:
- Stores parent ctor reference for get_super lookup
- Sets __proto__ on child prototype pointing to parent prototype
- call_constructor sets __proto__ on result object

Known limitation: class extends with explicit super() call hangs
due to stack management in the derived constructor bytecode path.
Class basic and class methods work. Inheritance deferred to follow-up.

call_constructor now correctly pops argc + 2 items (args, new_target,
func_obj) matching QuickJS C behavior. Previously it only popped
argc + 1, treating new_target as func_obj. This worked for normal
new Func() (where dup makes both the same) but broke super() calls
in derived constructors where new_target != parent_ctor.
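The argc + 2 pop can be sketched on a list-based stack. The module and function names are hypothetical, and the layout (last argument on top, then new_target, then func_obj) is an assumption derived from the order given in the commit message.

```elixir
defmodule CtorStackSketch do
  # Stack is a list with the top at the head. For `new F(a, b)` it holds,
  # from the top: b, a, new_target, func_obj, then everything below.
  def pop_ctor_operands(stack, argc) do
    {rev_args, [new_target, func | rest]} = Enum.split(stack, argc)
    {Enum.reverse(rev_args), new_target, func, rest}
  end
end
```

Popping only argc + 1 items would bind the function to what is really new_target and leave func_obj on the stack, which goes unnoticed when both are the same value and breaks once they differ.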

Also adds get_super opcode and special_object type 3 (new.target)
for derived class constructors. define_class now stores parent
constructor reference for get_super lookup and chains prototypes.

Class inheritance with explicit super() still fails — the derived
constructor's call_constructor triggers the parent constructor
twice due to a re-entry bug in the bytecode dispatch flow. The
var_ref cell for the super-called flag gets consumed on the first
call, leaving :undefined for the second.

Restore missing _ -> this_obj fallback in call_constructor case
statement (lost during debug cleanup). Remove all debug IO.puts.