[Script] Airbag - API Crash Handler

1552604992

Edited 1553444134
GM Michael
API Scripter
We all try to make our code as stable as possible, but sometimes crashes still happen. API crashes can easily stall a game, lead to confusion, slow script development, break game immersion, and cause enormous frustration as you struggle to understand what actually went wrong. Users receive no warning that a crash has occurred and must instead figure it out on their own and then navigate to the API page to restart the API. Even then, all you have is a single unformatted console line, making it difficult to draw conclusions about the source of the problem.

Airbag

Airbag is a two-part script that wraps the rest of your codebase, isolating the fragile API from exceptions thrown by your installed scripts, giving the user direct insight when a crash occurs, and providing the ability to force an API restart.

How

Roll20's API concatenates all of your scripts into a single enormous file, hence the sometimes-astronomical line numbers when you receive an exception. Ordinarily, all scripts are self-contained and are themselves compilable, but Airbag's halves are not individually valid, instead relying on each other to function. By sandwiching the rest of your API code in the middle, Airbag acts as an oversized try-catch block and can override API functions called by your other scripts. This means that exceptions within your installed scripts will be caught by Airbag, rather than the API, allowing Airbag to inform the user and prompt them to restart the dead scripts.

Airbag does not neuter the value of in-script error handling. Airbag will treat any unhandled exception that bubbles up to it as an unintended fatal error and do everything in its power to shut down the codebase and alert the user without killing the API as a whole.

Airbag will catch exceptions from...
- API Main Thread: If initial API startup would immediately fail due to a logic error such as a bad reference, Airbag will deploy.
- on(type, handler): This API function is shadowed to allow Airbag to wrap the asynchronous handler code.
- setTimeout(): Functions scheduled with setTimeout will be handled by Airbag.

Airbag will not catch exceptions from...
- Infinite Loops: No support is planned at this time. Halting on while(true) before API crash would require shadowing the while keyword.
- Scheduled Functions (other than setTimeout): Support is planned (but is not yet implemented) for many scheduling functions, specifically setInterval(), _.delay(), and _.defer(). I am open to supporting other functions, but this effort has diminishing returns.
- Asynchronous Get/Set: Things like reading from the gmnotes section are functions of objects that Airbag might not have access to or get the chance to shadow. If someone can figure out a safe way to do this, even in select cases, I'm all ears, but I'm skeptical.

Operation

Should an exception occur while Airbag is installed, Airbag will catch the exception, dump the message and stack trace to the console log and chat log, and finally prompt the GM to restart the API at their leisure with a chat button. In chat, you'll specifically see three things:
- SRC: Airbag attempts to ascertain what source file and local line number actually threw the exception. (This may be inaccurate if the offending script or the one before it is minified.) This item will not display unless the offending script is Marked (see below for developers).
- MSG: This is the exception message.
- STK: The stack trace, which is printed in global line numbers.

Code

To run Airbag, you must install both scripts. AirbagStart MUST be the very first script installed in a game (unfortunately, this means uninstalling and reinstalling all your existing scripts if you already have some, or at minimum prepending AirbagStart to your first script if you have the source). Similarly, AirbagEnd MUST be the very last script installed. This allows them to wrap the rest of your scripts. If you have scripts that are outside the Airbag sandwich, Airbag will not be able to catch the exceptions they throw.

Source

Changelog
- 1.0: Initial Release
- 1.1: Add on() support
- 1.2: Fix duplicate Airbag on() registration
- 1.3: Add setTimeout and clearTimeout handling, add line localization

For Developers

Utility Functions

Airbag supports some functions to help your development.

    // Call on the very first line of your file (even before boilerplate)
    void MarkStart(string scriptName)

    // Call on the last meaningful line of your file (whitespace afterwards won't hurt it, but isn't recommended)
    void MarkStop(string scriptName)

    // Converts a global line number into a local line number for a script
    // Returns: {string Name, int Line} where Name is the name of the script it is from and Line is the local line number within that script.
    // Requires: Your script must be Marked with MarkStart and MarkStop.
    obj ConvertGlobalLineToLocal(int globalLineNumber)

By Marking your file, Airbag will know where your file starts and stops, meaning it will be able to mark your file as the source of exceptions and even tell you the line number in your file. In case you don't trust Airbag to be installed, you can always do something like...

    if (typeof MarkStart === 'function') MarkStart('MyScriptName');

What does Airbag Deployment Do?

When Airbag detects a fatal error, it performs the following operations in order:
1. The codebaseRunning internal flag is set to false. (All shadowed functions check this first, so any shadowed functions called after this point will do nothing.)
2. All registrations by other scripts to the on() function are purged.
3. All timeouts are purged.
4. globalconfig is set to a blank object.
5. Error is logged.
6. User is alerted and prompted with a [Reboot] button.

Future Development

Current plans for the future are most immediately the remaining schedule functions, but I am open to providing new development tools if they are popular. I would also like to expand the line number conversion system to include the full stack trace and even provide guesses when it detects an error outside a Marked script.
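For readers who want to see the "sandwich" idea in isolation, here is a minimal sketch that runs in plain Node.js rather than the Roll20 sandbox. It is not the Airbag source: airbagStart, airbagEnd, userScript, and deploy are hypothetical stand-ins, and concatenation is simulated with new Function. It only illustrates that two fragments which do not parse on their own can form a try/catch around whatever sits between them.

```javascript
// Hypothetical stand-ins for the two halves; this is not the actual Airbag source.
const airbagStart = 'try {\n';
const airbagEnd = '\n} catch (e) { deploy(e); }\n';

// Stand-in for an installed script with a bad reference at startup.
const userScript = 'const cfg = undefined;\ncfg.N;';

// Hypothetical crash reporter: just records the error type.
const caught = [];
const deploy = (err) => caught.push(err.name);

// Returns true when the given source parses as a standalone script.
const parses = (src) => {
  try {
    new Function(src);
    return true;
  } catch (e) {
    return false;
  }
};

const startAlone = parses(airbagStart); // false: unclosed try block
const endAlone = parses(airbagEnd);     // false: catch without a try

// Simulate the sandbox joining scripts in installation order; only the whole is valid.
const run = new Function('deploy', airbagStart + userScript + airbagEnd);
run(deploy);
```

After the run, caught holds a single 'TypeError' entry: the startup failure reached the wrapper's catch block instead of escaping to the host.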
1552606274
GiGs
Pro
Sheet Author
API Scripter
Interesting idea. Are scripts always loaded in the order they have been installed?
1552610827

Edited 1552617948
GM Michael
API Scripter
Seem to be.  In my testing, I had 6 scripts running and they executed in the order they were installed despite that order being neither alpha nor anti-alpha.  Setting scripts to active/inactive once installed did not affect the order, so it really does appear to simply iterate over them in installation order.  Theoretically, you could probably change the order up a bit with hoisting, but I do not believe JS has any means by which a standalone-compilable script could escape the Airbag sandwich.
1552659325
The Aaron
Roll20 Production Team
API Scripter
That is a neat idea, particularly the restart part.
It's causing my sandbox to crash.
1552682085

Edited 1552682191
GM Michael
API Scripter
How odd... I guess search your scripts for .N? Maybe something's calling eval? I can't imagine what else would do that. Or maybe add me as a GM to your game and I can try to take a look?
1552682891
The Aaron
Roll20 Production Team
API Scripter
The only thing in the repo with .N is DLEllipseDrawer:

    //treehugger.js
    //Author: Tim Matchen
    /*Use: Simply draw an ellipse on the dynamic lighting layer and the script replaces the ellipse with a n-sided polygon approximating the ellipse. The default number of sides is 20; this can be adjusted using the command !treehugger n, where n is the desired number of sides. For example, !treehugger 10 would generate 10-sided polygons instead of 20.*/
    on("ready",function(){
        var gc = globalconfig && globalconfig.dlellipsedrawer;
        if(isNaN(gc.N) != 1){
            var n = Math.ceil(gc.N);
        }
        else{
            var n = 20;
            log("Invalid input from globalconfig! Using n = 20")
        }
        log("Treehugger is up and running!")
        on("add:path",function(obj){
            /* ... */
The Aaron said: The only thing in the repo with .N is DLEllipseDrawer: [snip]

Yea, that was the culprit, thank you.
1552684363

Edited 1552689689
GM Michael
API Scripter
Update: nevermind. Got PM'd about it. Airbag is fine. The issue is unrelated.

Original post...

How would that crash out Airbag though? Having said that, that's such a weird way to check to see if... Honestly, I don't even know what that's trying to do.

    // This will try to get .dlellipsedrawer, but I can't find a definition for
    // that anywhere. So this'll be falsy.
    var gc = globalconfig && globalconfig.dlellipsedrawer;
    if (isNaN(gc.N) != 1) { // .N won't exist.
        var n = Math.ceil(gc.N); // this is just setting a pointless temp variable to something that'll crash.
    }
1552760347
Jakob
Sheet Author
API Scripter
Well, isNaN(undefined) is true, and true == 1, so this thing won't execute ... ah, this is JavaScript art :D:D:D.
1552931420

Edited 1552932433
Ammo
Pro
EllipseDrawer is trying to see if globalconfig.dlellipsedrawer.N is configured (i.e. not undefined and a valid number) to use as the value of 'n', and otherwise set it to 20. It crashes because it does not check whether globalconfig.dlellipsedrawer actually exists, and it does not. Hence this ends up being '<undefined>.N'. It isn't a temp variable, since var declarations are function-scoped rather than block-scoped in JavaScript. I assume the value is used further down as the default value for the command line argument 'n'.

Sandwich is a cool idea. Do exceptions thrown in event handlers on(... , ...) bubble up back to the block where the => arrow function is defined in JavaScript?
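To make the failure mode concrete, here is a small sketch of a more defensive read of that setting. readSides is a hypothetical helper, not part of DLEllipseDrawer, and the config object is passed in as a plain argument so the snippet runs outside the sandbox. It assumes the intent is "use N if it is a usable positive number, otherwise 20".

```javascript
// The pattern from the script: reading .N off an undefined section throws.
let originalThrows = false;
try {
  const gc = undefined;   // what globalconfig.dlellipsedrawer evaluates to when unset
  isNaN(gc.N);
} catch (err) {
  originalThrows = err instanceof TypeError;
}

// Hypothetical helper: check that the section exists before touching N.
function readSides(config, fallback = 20) {
  const section = config && config.dlellipsedrawer;
  const n = section ? Number(section.N) : NaN;
  return Number.isFinite(n) && n > 0 ? Math.ceil(n) : fallback;
}

const whenMissing = readSides(undefined);                            // falls back
const whenEmpty = readSides({});                                     // falls back
const whenSet = readSides({ dlellipsedrawer: { N: '9.2' } });        // rounds up
const whenGarbage = readSides({ dlellipsedrawer: { N: 'twelve' } }); // falls back
```

The key difference is that the existence check happens on the section itself, so a missing section yields the default instead of an exception.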
1552934765
keithcurtis
Forum Champion
Marketplace Creator
API Scripter
Quoting the Aaron from another thread, because this is just such a useful idea for installation: The Aaron  said: Side note, you can probably modify the first script tab you have installed into the start for api-crash-handler and just append the replaced script to the end before adding the end part of api-crash-handler.  Might be much less effort. =D
1553081575

Edited 1553137492
GM Michael
API Scripter
keithcurtis said: Quoting the Aaron from another thread, because this is just such a useful idea for installation: The Aaron  said: Side note, you can probably modify the first script tab you have installed into the start for api-crash-handler and just append the replaced script to the end before adding the end part of api-crash-handler.  Might be much less effort. =D That's a good idea!  Thanks!
1553138714

Edited 1553139017
GM Michael
API Scripter
Ammo said: [snip] Sandwich is a cool idea. Do exceptions thrown in event handlers on(... , ...) bubble up back to the block where the => arrow function is defined in JavaScript?

After testing, it seems that event-driven functions aren't going to trigger the airbag. They just directly crash the API because they don't go through the normal execution callstack. :(

About the only way around that would be to implement some sort of event registration system that would pass the data around to any scripts that registered with airbag, but that requires developers to hook into it themselves. The goal here was to avoid doing something like that, but it looks like it might be the only way...

Idk, maybe someone who knows more about js than I can think of some weird js quirk to do it, but I can't, so maybe some future version of airbag will support developers doing something like...

    airRegister('chat:message', (msg) => {
        // do stuff that could be unsafe
    });

The ultimate goal would be to minimize the effort on the author's part to encourage use of Airbag over the stock on() function.
1553139175
The Aaron
Roll20 Production Team
API Scripter
Since your airbag creates a new scope around all the scripts, you could provide your own on() function that shadows the global one and seamlessly passes the registration through with a try/catch decorator wrapping it. 
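A rough sketch of what that suggestion could look like, runnable in plain Node.js. Everything here is a stand-in: hostOn imitates the sandbox's global on(), deploy is a hypothetical crash reporter, and airbagScope imitates the scope created by the wrapper. It is meant to show the decorator idea, not how Airbag actually implements it.

```javascript
// Stand-in for the sandbox's global on(); the real one is provided by Roll20.
const hostHandlers = {};
const hostOn = (type, fn) => {
  (hostHandlers[type] = hostHandlers[type] || []).push(fn);
};

// Hypothetical crash reporter.
const reported = [];
const deploy = (err) => reported.push(err.message);

// Decorator: returns a handler that forwards to the original inside try/catch.
const guarded = (handler) => (...args) => {
  try {
    return handler(...args);
  } catch (err) {
    deploy(err);
  }
};

(function airbagScope() {
  // Inside the wrapper's scope, this declaration shadows the outer on().
  const on = (type, handler) => hostOn(type, guarded(handler));

  // An installed script calls on() exactly as it normally would.
  on('chat:message', (msg) => {
    throw new Error('bad handler: ' + msg.content);
  });
})();

// Simulate the sandbox firing the event later, outside the wrapper's call stack.
hostHandlers['chat:message'].forEach((fn) => fn({ content: '!test' }));
```

Because the try/catch travels with the handler, the exception is caught even though the event fires long after the wrapping block has finished executing.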
1553139303
The Aaron
Roll20 Production Team
API Scripter
You’ll probably also want to provide new versions of setTimeout(), setInterval(), _.delay(), and _.defer(). 
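The same decorator idea can be sketched for setTimeout(), with the addition that pending timers are tracked so they can be cancelled in bulk. The names safeSetTimeout, safeClearTimeout, purgeTimers, and deploy are hypothetical, and this is only one plausible shape for such a wrapper, runnable in plain Node.js.

```javascript
const nativeSetTimeout = setTimeout;
const nativeClearTimeout = clearTimeout;

const liveTimers = new Set();   // timers scheduled through the wrapper and not yet fired
const reported = [];

// Cancels everything still pending.
function purgeTimers() {
  liveTimers.forEach((id) => nativeClearTimeout(id));
  liveTimers.clear();
}

// Hypothetical crash reporter: record the error, then stop all pending work.
const deploy = (err) => {
  reported.push(err.message);
  purgeTimers();
};

// Wrapper that could shadow setTimeout inside the sandwich.
const safeSetTimeout = (fn, ms, ...args) => {
  const id = nativeSetTimeout(() => {
    liveTimers.delete(id);
    try {
      fn(...args);
    } catch (err) {
      deploy(err);
    }
  }, ms);
  liveTimers.add(id);
  return id;
};

// Wrapper that could shadow clearTimeout, keeping the bookkeeping in sync.
const safeClearTimeout = (id) => {
  liveTimers.delete(id);
  nativeClearTimeout(id);
};

// Usage: one callback that will fail, one long-running timer, one cancelled timer.
safeSetTimeout(() => { throw new Error('late failure'); }, 0);
safeSetTimeout(() => {}, 60000);
const cancelled = safeSetTimeout(() => {}, 60000);
safeClearTimeout(cancelled);
```

When the failing callback fires, deploy records the error and the remaining 60-second timer is cancelled along with it, so nothing from the crashed codebase keeps running.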
1553168617

Edited 1553168867
GM Michael
API Scripter
Good idea! *8 hours of mostly sleep later* V1.1 should have an operational shadow for on(). I'll add scheduling next.
As long as you are being clear that you are only trying to catch SOME errors, then that's fine. There are other asynchronous operations that you will not know about. For example, in the Roll20 API there are asynchronous reads (like gmnotes on journal entries) that you won't be able to wrap. Also, more advanced scripts can just choose to do things asynchronously, via Promise or otherwise. It is probably ok that you don't catch these, because developers that use them are probably capable of deciding to catch their own errors if they want to. Just clarifying that this can't ever be all of them.

I am more worried about the idea of restart. If you restart a script in the same script host, won't you get duplicate handlers for all the events? Won't every event now get handled twice (and then three times, etc.)?

I applaud what you are trying to do, and have at it as long as it is fun. :) But a long term solution to this problem would be to petition Roll20 for a built-in GM command like "!roll20_api_restart" that actually tears down the script host (the software running the scripts) and starts it back up again, like restarting from the console.
1553180830
The Aaron
Roll20 Production Team
API Scripter
That double subscribe is a great point. Maybe aiming more for notification is a better goal. 
1553231077

Edited 1553231119
GM Michael
API Scripter
Double-registration should be fixed.  Scheduling will use a similar system. Also, I updated the OP to be more clear as to what Airbag can and can't do.
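For anyone curious how double registration can be avoided when the host offers no way to unregister, here is one possible approach, sketched for plain Node.js. It is an assumption, not a description of Airbag's code: the wrapper hooks the host once per event type and keeps script handlers in its own table (registry), which it empties before scripts are run again. hostOn, fire, and purgeRegistrations are hypothetical names, and error handling is left out to keep the focus on registration.

```javascript
// Stand-in for the sandbox's on(), which offers no way to unregister a handler.
const hostHandlers = {};
const hostOn = (type, fn) => {
  (hostHandlers[type] = hostHandlers[type] || []).push(fn);
};
const fire = (type, ...args) => (hostHandlers[type] || []).forEach((fn) => fn(...args));

// Event type -> handlers registered by installed scripts.
const registry = new Map();

// Shadowed on(): hooks the host once per event type, then only edits the registry.
const on = (type, handler) => {
  if (!registry.has(type)) {
    registry.set(type, []);
    hostOn(type, (...args) => registry.get(type).forEach((h) => h(...args)));
  }
  registry.get(type).push(handler);
};

// Run before the scripts execute again, so handlers from the old run cannot fire.
const purgeRegistrations = () => registry.forEach((list) => { list.length = 0; });

let calls = 0;
const installedScript = () => on('chat:message', () => { calls += 1; });

installedScript();       // first start
purgeRegistrations();    // crash, then restart requested
installedScript();       // scripts register again
fire('chat:message', {});
```

After the simulated restart the host still holds a single dispatcher for 'chat:message', and the event is handled once rather than twice.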
Michael G. said: Double-registration should be fixed. Scheduling will use a similar system. Also, I updated the OP to be more clear as to what Airbag can and can't do.

Looks good! Also, you write very nice code. I mean, except for the fact that it is in JavaScript :) :)

For increased usefulness, you and The Aaron (because people listen to him) could ask Roll20 nicely for a simple API function that allows you to ask for an array containing the script host's starting line numbers for loaded scripts. Then you could translate the line numbers in crashes without making every script have to implement that themselves. It would mean we don't have to do this nonsense:

https://github.com/derammo/der20/blob/master/include/header.js.txt
https://github.com/derammo/der20/blob/947375ee8c38ba0554f6a4220b964e72cc2f8902/src/der20/plugin/main.ts#L344

It would be maximum bang for the buck to do this in one place here. Roll20 can't do it for you, even if they ever implement line number translation on unhandled exceptions in the script host, because you no longer allow the exceptions to fly out of script host. That means the only place this can happen now is your exception handler.
1553279526

Edited 1553279548
GM Michael
API Scripter
Oh, if the Roll20 team just implemented chat alerts of API failure, Airbag wouldn't exist.

Having said that, having a line number grabber would be handy. Airbag could have two functions that scripts dependent on it could use: MarkScriptStart(scriptName) and MarkScriptStop(scriptName). You'd just call them on the first and last line of your script and it would alert Airbag what line numbers your script occupies, so it can then point to it if something goes wrong. If something was reported outside that range, is it possible for an API script to read the API js file? If nothing else, reporting the lines around the error would be useful. If some installed scripts were using the mark functions, it could provide some bounds on which script failed, even if the failing script wasn't using the mark functions itself.

Also, because all this would require line number conversion functions, other scripts could just use those for their own internal error reporting.

...I get the feeling that Airbag is about to balloon into a debugging suite because now my mind is going to breakpoints and how to implement them...
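The range bookkeeping behind that idea can be sketched in a few lines. This is a simplified illustration, not Airbag's implementation: markStart and markStop here are hypothetical variants that take the global line number as an explicit argument (a real version would have to discover it at call time, for example from a stack trace), and the sketch assumes the start mark sits on local line 1 of the script.

```javascript
// Script name -> { start, stop }, both in global (concatenated) line numbers.
const marks = new Map();

// Hypothetical variants that receive the global line explicitly.
const markStart = (name, line) => marks.set(name, { start: line, stop: Infinity });
const markStop = (name, line) => {
  const mark = marks.get(name);
  if (mark) mark.stop = line;
};

// Returns { Name, Line } for a line inside a marked script, or null otherwise.
function convertGlobalLineToLocal(globalLine) {
  for (const [name, { start, stop }] of marks) {
    if (globalLine >= start && globalLine <= stop) {
      return { Name: name, Line: globalLine - start + 1 };
    }
  }
  return null;
}

markStart('Alpha', 101);
markStop('Alpha', 180);
markStart('Beta', 400);
markStop('Beta', 460);

const hit = convertGlobalLineToLocal(150);   // inside Alpha
const miss = convertGlobalLineToLocal(250);  // between marked scripts
```

Here hit is { Name: 'Alpha', Line: 50 } and miss is null. A null result for a line that falls between two marked ranges still narrows the search to whatever unmarked scripts were installed between those two.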
1553404178

Edited 1553447627
GM Michael
API Scripter
Further Discoveries

So in my experimenting today, I discovered that Roll20 directly concatenates files. That is to say, there's no line break between them. That means that any minified js files are going to be a single line, and if you have multiple adjacent files that are minified, Airbag won't be able to tell them apart. I suppose it can just warn the user of such things, but still, that's rather disappointing.

Also, this has rather horrifying implications for if someone has a comment on the last line of their script. O.o (reported as a bug here)

1.3.2 Update

In other news, 1.3 is out: setTimeout() is now protected by Airbag and localized line numbers can now be detected. To set this up for your script, call MarkStart('MyScriptName') and MarkStop('MyScriptName') on the first and last lines of your script.

Development Prospects

Also, current thought for breakpoints... I'm thinking something like...

    // Dumps state, this, and globalconfig to chat, then suspends this thread's operations until it's over
    Promise Breakpoint(conditional, [objects, to, dump...])
    // ex:
    // await Breakpoint(true, myVar, myOtherVar);

I'm thinking the user would then get a chat log of the object tree structure with the ability to expand nodes. state, this, and globalconfig would always be included, but if a user provided other objects or variables, those would be dumped too. Of course, because this has to arrest the current thread, that means we need to use Promises, which means your whole function now needs to be async, which might be more trouble than it's worth, but hopefully not!

Relatedly, I was thinking about trying to shield promises. In principle, you could just add a catch block after every then or catch block added by the user, but the redirection gets really weird really fast on this one, so I don't see myself ever supporting them.
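The trailing-comment hazard mentioned under Further Discoveries is easy to reproduce outside the sandbox. The sketch below uses two made-up scripts (scriptA, scriptB) and simulates joining them with and without a line break via new Function. It says nothing about how Roll20 joins scripts beyond what the post reports.

```javascript
// Two hypothetical scripts; the first ends with a line comment and no newline.
const scriptA = "out.push('A ran');\n// end of script A";
const scriptB = "out.push('B ran');\nout.push('B finished');";

// Joins the scripts with the given separator, runs the result, returns what executed.
const runJoined = (separator) => {
  const out = [];
  new Function('out', scriptA + separator + scriptB)(out);
  return out;
};

const direct = runJoined('');       // B's first line lands inside A's trailing comment
const withBreak = runJoined('\n');  // a line break keeps the two scripts apart
```

With direct concatenation, the first line of the second script is silently swallowed by the comment, so only 'A ran' and 'B finished' are recorded. With a line break, all three statements run. Ending a script with a newline is a cheap defence.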
ok I have a gross suggestion:
- hook some function that doesn't do anything harmful (like log(...))
- detect specific arguments passed, and use these as MarkScriptStart and MarkScriptEnd
=> now scripts that support this don't require Airbag to run
1553528290

Edited 1553528315
GM Michael
API Scripter
I don't see how that's a whole lot better than just checking for MarkStart's existence in-line. Plus, whatever I'd shadow to do that would take a performance hit from the string comparison. It's clever, but I don't think it's the right way to do it. I'm not opposed to doing something like that in the future if I were to add some debugging functions though.
I did say that it was a "gross" suggestion. As in, it would make me vomit a little to write something like that. That said, you do need to create an easy way for people to support Airbag only optionally. You don't want to have a situation where people can't use existing scripts without it. Checking for existence and then calling the function in a one-liner is probably the best you can do?
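One detail matters for that one-liner: how the existence check is written. A bare truthiness test on an undeclared identifier throws a ReferenceError, whereas typeof is safe on names that were never declared. The snippet below runs in plain Node.js, where MarkStart is not defined, which mimics a game without Airbag installed.

```javascript
// A bare truthiness check reads the identifier, which throws if it was never declared.
let bareCheckThrew = false;
try {
  if (MarkStart) MarkStart('MyScriptName');
} catch (err) {
  bareCheckThrew = err instanceof ReferenceError;
}

// typeof never throws on an undeclared name, so this is a no-op without Airbag.
let guardedCheckThrew = false;
try {
  if (typeof MarkStart === 'function') MarkStart('MyScriptName');
} catch (err) {
  guardedCheckThrew = true;
}
```

So a script that wants optional Airbag support can put the typeof form on its first line and the matching MarkStop check on its last, and it will still load in games where Airbag is absent.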
Btw, I test my non-trivial code offline in Node.js. So do some of the other devs here who write complicated stuff. If I had any trouble debugging, I would probably focus on making mock20 better as a test platform, instead of trying to create the ability to debug in the sandbox itself. Have you worked with that? I don't use it myself, because I use TypeScript and can compile my classes separately in unit tests, but I feel like mock20 + a real debugger would make you happier.