Closed Bug 565374 Opened 15 years ago Closed 5 years ago

Front end analysis for Proper Tail Calls

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

RESOLVED WONTFIX

People

(Reporter: brendan, Unassigned)

References

(Blocks 1 open bug, )

Details

Attachments

(1 file)

The analysis can be used to select JSOP_POP instead of JSOP_POPV. /be
Looking for early review. I'll cobble up some kind of test-o-matic test. /be
Attachment #444956 - Flags: review?
Blocks: 565534
Attachment #444956 - Flags: review? → review?(dherman)
Status: NEW → ASSIGNED
cdleary, could you check out this patch's effect on parsemark? Thanks! /be
(In reply to comment #2) > cdleary, could you check out this patch's effect on parsemark? Thanks! Insignificant.
I spent some time this weekend looking at the analysis at the URL. Unless I've gotten myself confused, that analysis is a little broken. More to the point, the analysis is unsound for the optimization in comment #0. Specifically, a falls-off-the-end analysis is determining "MAY be the completion" not "MUST be the completion." So for example, the analysis would tell you that in the block: { 42; while (false) { f(); } } the while-loop is in "tail position" and the value 42 can be discarded. But that would be wrong! The completion value of the block should be 42. What's needed for the JSOP_POP optimization is a MUST-NOT-complete analysis. The whole idea of a statement being in tail position is kind of meaningless (which is why I think the attribute grammar at the URL is wrong). I'll post another comment with a proposed analysis. Specifically, we want: E.tail inherited E is in tail position S.mute inherited S's completion will always be ignored S.wrapped inherited if S terminates, the activation is still needed S.completes synthesized if S terminates, it MUST produce a completion Dave
Here's a draft analysis that tries to determine (a) what expressions are in tail position (E.tail) and (b) what statements' completion values can be ignored (S.mute) The latter is undecidable for JS, but this provides a conservative approximation. You can use whatever conservative heuristic you want for the "nonTrivial" property in loops; presumably we'd do something cheap and constant time, but bhackett's static analysis might be able to help us be more clever. (Not a big enough win to be worth it?) ---------------------------------------------------------------------- Expressions: E -> E1 , E2 E2.tail = E.tail E -> E1 ? E2 : E3 E2.tail = E3.tail = E.tail E -> let (V1 = E1, ..., Vn = En) E{n+1} E1.tail = ... = En.tail = false E{n+1}.tail = E.tail Statements and clauses (synthesized attributes): S -> { S1 ... Sn } S.completes = !S.jumps && exists i . Si.completes S -> E; S.completes = true S -> do S1 while (E) S.completes = S1.completes /* must loop at least once to complete */ S -> while (E) S1 S.completes = E.nonTrivial && S1.completes S -> for (E1; E2; E3) S1 S.completes = E2.nonTrivial && S1.completes S -> for (E1 in E2) S1 S.completes = E2.nonTrivial && S1.completes S -> if (E) S1 else S2 S.completes = S1.completes && S2.completes S -> L: S1 S.completes = S1.completes S -> return E; S.completes = true S -> break L; S.completes = false S -> continue L; S.completes = false S -> with (E) S1 S.completes = S1.completes S -> try S1 K1 ... Kn S.completes = S1.completes && forall i . Ki.completes S -> try S1 K1 ... Kn finally S2 S.completes = S1.completes && S2.completes && forall i . Ki.completes S -> throw E; S.completes = true S -> switch (E) C1 ... Cn S.completes = forall i . Ci.completes K -> catch (V) S K.completes = S.completes C -> case E: S1 ... Sn C.completes = exists i . Si.completes C -> default: S1 ... Sn C.completes = exists i . Si.completes Statements and clauses (inherited attributes): S -> { S1 ... Sn } S1.wrapped = ... = Sn.wrapped = S.wrapped forall i . Si.mute = exists j > i . Sj.completes S -> E; E.tail = false S -> do S1 while (E) S1.mute = true S1.wrapped = S.wrapped E.tail = false S -> while (E) S1 S1.wrapped = S.wrapped S1.mute = S.mute E.tail = false S -> for (E1; E2; E3) S1 E1.tail = E2.tail = E3.tail = false S1.mute = S.mute S1.wrapped = S.wrapped S -> for (E1 in E2) S1 E1.tail = E2.tail = false S1.wrapped = S.wrapped S1.mute = S.mute S -> if (E) S1 else S2 S1.wrapped = S2.wrapped = S.wrapped S1.mute = S2.mute = S.mute E.tail = false S -> L: S1 S1.wrapped = S.wrapped S1.mute = S.mute /* tail unless inside of try etc. */ S -> return E; E.tail = !s.wrapped /* no tail positions inside */ S -> with (E) S1 S1.wrapped = true S1.mute = S.mute E.tail = false S -> try S1 K1 ... Kn S1.wrapped = true K1.wrapped = ... = Kn.wrapped = S.wrapped S1.mute = S.mute K1.mute = ... = Kn.mute = S.mute S -> try S1 K1 ... Kn finally S2 S1.wrapped = true K1.wrapped = ... = Kn.wrapped = true S2.wrapped = false S1.mute = S.mute /* finally might not complete! */ K1.mute = ... = Kn.mute = S.mute S2.mute = S.mute S -> throw E; E.tail = false S -> switch (E) C1 ... Cn E.tail = false C1.wrapped = ... = Cn.wrapped = S.wrapped C1.mute = ... = Cn.mute = S.mute K -> catch (V) S S.wrapped = K.wrapped S.mute = K.mute C -> case E: S1 ... Sn E.tail = false S1.wrapped = ... = Sn.wrapped = C.wrapped S1.mute = ... = Sn.mute = C.mute C -> default: S1 ... Sn E.tail = false S1.wrapped = ... = Sn.wrapped = C.wrapped S1.mute = ... = Sn.mute = C.mute Functions: F -> function(x1,...,xn) S S.wrapped = false S.mute = true ---------------------------------------------------------------------- Dave
A couple last points. 1. It's annoying that the inherited S.mute attribute depends on the synthesized S.completes attribute -- i.e., requires a bottom-up pass before the top-down pass is possible. But the completes attribute could probably be computed in the parser. 2. The |wrapped| attribute is a property of the context and only needed as an intermediate attribute for the purposes of the analysis, so you could store it in the tree context instead of the parse node. 3. So despite the 4 attributes, we can (I think) still keep it down to 2 bits in the parse node: union { uint32 pn_mutepos:1; /* statement's completion value is unused */ uint32 pn_tailpos:1; /* expression is in tail position */ }; uint32 pn_completes:1; /* statement always produces a completion */ Dave
FYI: updated the new Harmony proposal for proper tail calls to contain an attribute grammar in the style of comment #5. http://wiki.ecmascript.org/doku.php?id=strawman:proper_tail_calls Dave
Target Milestone: mozilla1.9.3a5 → ---
This is still applicable for IonMonkey, correct?
I'm not sure if front-end analysis is needed once we have SSA and type inference.
> I'm not sure if front-end analysis is needed once we have SSA and type > inference. I don't really understand how that would work, but as long as you don't fail to identify any tail positions the front-end analysis would have identified, it's fine. Incidentally, how would you be using TI? I understand how you could use it to identify *additional* tail calls: function f() /* always returns undefined */ { g(); } function g() /* always returns undefined */ { ... } but those are not strictly required for ES6. Proper tail calls only require identifying (roughly) function calls that are in tail position of a return: function f() { return g(); } function g() { ... } But those don't need type information. Dave
> But those don't need type information. Ah, I just meant it's hard to take advantage of them unless you know what function you're calling at compile-time.
> Ah, I just meant it's hard to take advantage of them unless you know what > function you're calling at compile-time. I think we should be clear about the difference between Proper Tail Calls (PTC) and Tail Call Optimization (TCO). ES6 will require the former; the latter is an optional set of additional optimizations we could do. PTC does not require knowing the target of a call; it just requires a calling protocol that doesn't accumulate stack space for calls in tail position, regardless of whether the called functions are statically known. Dave
Outstanding question for Dave from IRC: when we hit one of these hypothetical |JSOP_TAILCALL|s and it has a non-zero number of arguments, we clobber the existing frame space with the (arguments and such for the) new frame, correct? It seems like this could wreak some havoc on things like stack-walking-dependent protocols and perhaps determining principals (which I understand we do in a stack-like fashion). Has anybody chatted with luke about how that stuff would continue to work without requiring accumulation of stack space?
> Outstanding question for Dave from IRC: when we hit one of these > hypothetical |JSOP_TAILCALL|s and it has a non-zero number of arguments, we > clobber the existing frame space with the (arguments and such for the) new > frame, correct? Right. > It seems like this could wreak some havoc on things like > stack-walking-dependent protocols and perhaps determining principals (which > I understand we do in a stack-like fashion). Has anybody chatted with luke > about how that stuff would continue to work without requiring accumulation > of stack space? It's solvable, but you have to hang on to whatever security principals the caller-frame had and merge them into the principals of the callee-frame. You can do a destructive merge since control will never return to the caller. This idea was first published in this paper: http://lambda-the-ultimate.org/node/1333 Dave
Comment on attachment 444956 [details] [diff] [review] analysis, unused as yet Cleaning up old to-do's. Dave
Attachment #444956 - Flags: review?(dherman) → review-
Assignee: brendan → nobody
No assignee, updating the status.
Status: ASSIGNED → NEW
No assignee, updating the status.
No assignee, updating the status.

The fronted is going to be replaced by rust and the whole state of tail-calls is still murky.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: