
Parsing GMNotes from Token

1614777584
Nick O.
Forum Champion
I'm trying to parse the contents of a token's GM Notes field to find a particular line of text.

    var gmNotes = npcToken.get("gmnotes");
    var fullText = decodeURIComponent(gmNotes);
    var allLines = fullText.split("<p>");
    allLines.some(function(line){
        if (line.startsWith("TargetText")){
            //dostuff
        }
    });

This works, but it isn't sitting right with me. Splitting on the paragraph tag feels wrong, but I'm not sure of a more efficient/cleaner way to do it. If anyone has any ideas, I'd appreciate it. :)
1614790291

Edited 1614790533
timmaugh
Pro
API Scripter
Are you concerned that you might not catch all of the lines? If so, you could split on a regex of all of the style demarcations, capturing the type in groups:

    let allLines = [];
    fullText.replace(/(<(p|h1|h2|h3|h4|h5|h6)>(.*)<\/\2>)/gmi,(m,g1,g2,g3) => {
        allLines.push({ type: g2, text: g3 });
        return m;
    });

(air-coded, so normal testing caveats apply)

Purists will dispute leveraging the replace operation this way (as a forEach that can quickly access the regex groups), so if that bothers you, you can rewrite it as a split => forEach => regex exec => push, but this should work. In the end, you should have an array of tokens with a type property corresponding to the tag and a text property corresponding to the interior text. Is that the sort of thing you were looking for?
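For readers who would rather avoid the replace trick, here is a minimal sketch of the same idea using a plain `exec` loop. The sample `fullText` is made up for illustration, and the lazy `.*?` is a change from the regex above, assumed here so that several tags sitting on one line are captured one at a time. It has not been tested in the Roll20 sandbox.

```javascript
// Hypothetical sample of decoded GM Notes; real notes may look different.
const fullText = '<h2>Goblin Boss</h2><p>TargetText: ambush at dawn</p><p>Other note</p>';

// Same tag list as above, but with a lazy (.*?) body so each tag is matched separately.
const tagrx = /<(p|h1|h2|h3|h4|h5|h6)>(.*?)<\/\1>/gi;

const allLines = [];
let match;
// With the g flag, exec() resumes from lastIndex on each call and returns null when done.
while ((match = tagrx.exec(fullText)) !== null) {
    allLines.push({ type: match[1].toLowerCase(), text: match[2] });
}

// Pick out the entry whose text starts with the string we are looking for.
const target = allLines.find((line) => line.text.startsWith('TargetText'));
```

The resulting array has the same `{ type, text }` shape as the replace version, so the rest of a script could use either one.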
1614791245

Edited 1614792190
timmaugh
Pro
API Scripter
Alternatively, you might be able to use a regex that asserts the start of line but with the multi-line flag to quickly locate this one piece of text...

    let targetrx = /^<(p|h1|h2|h3|h4|h5|h6)>(Target Text.*)<\/\1>/mi;
    let theLineIWant;
    if(targetrx.test(fullText)) theLineIWant = targetrx.exec(fullText)[2];
    log(theLineIWant);

You might have to escape the target text if it might have regex codes in it:

    const escapeRegExp = (string) => { return string.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&'); };
    let escTgt = escapeRegExp(`Target(Text) with /*codes*/`);
    let targetrx = new RegExp(`^<(p|h1|h2|h3|h4|h5|h6)>(${escTgt}.*)<\\/\\1>`,'mi');

But you'll want to test that regex... I always have to double check escaping the slashes so that they show up in the string. =D
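As a way to test that regex outside of Roll20, the two snippets above can be folded into one small function. This is only a sketch: `findLine` and both sample strings are hypothetical names and data made up for the example. It also shows one thing to watch for: the `^` anchor only matches at the start of the string or right after a line break, so a target paragraph in the middle of a single-line notes string is not found.

```javascript
// Escape characters that have special meaning in a regex (same helper as above).
const escapeRegExp = (string) => string.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: returns the text of the first tag at the start of a line
// whose content begins with `target`, or null if there is no such tag.
const findLine = (fullText, target) => {
    const rx = new RegExp(`^<(p|h1|h2|h3|h4|h5|h6)>(${escapeRegExp(target)}.*)<\\/\\1>`, 'mi');
    const match = rx.exec(fullText); // one exec call instead of test + exec
    return match ? match[2] : null;
};

// Made-up sample data: the same paragraphs with and without a line break between them.
const multiLine = '<p>Intro</p>\n<p>Target(Text) with /*codes*/ and more</p>';
const singleLine = '<p>Intro</p><p>Target(Text) with /*codes*/ and more</p>';

const found = findLine(multiLine, 'Target(Text) with /*codes*/');
const missed = findLine(singleLine, 'Target(Text) with /*codes*/');
```

Here `found` holds the paragraph text and `missed` is `null`. If the decoded notes turn out to be one long line, dropping the `^` (and making `.*` lazy) may be a better fit.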
1614862575
Nick O.
Forum Champion
Thanks, Timmaugh. I like the second approach, but I'm not sure I understand exactly how the regex is working, so please bear with me. It looks like it's creating a pattern that should match any html tag, then the string I'm searching for. What does the <\/\1> bit do? Also, in the if statement, targetrx.exec(fullText)[2]; is executing the regex, and then element #0 will be the full string, element #1 will be the tag, and element 2 is the text that was being searched for?
1614865656

Edited 1614865804
timmaugh
Pro
API Scripter
Hey, Nick... yep, you're right about the groups and how they will be numbered. Group 0 is the full match, and group 1 is the tag type.

The <\/\1> bit is using a way to refer to capture groups while still in the regex. Just as you can use $1 to refer to the first capture group in js in a replace operation, you can use \1 to refer to it in a regex. So if the regex triggers on a <p> tag, the first group (not the zeroth group) will be 'p'. That means we're looking for </p> to close the capture group of the text we're looking for.

    </p>
    ...becomes...
    </\1>
    ...that needs to be escaped, so it becomes...
    <\/\1>

Last... I'm sure you know about escaping the slashes in a string, so more for others who might be reading: if you have to use it as an actual string to supply a `new RegExp(...)` function, you have to escape those slashes again, since it will exist as a string before it is the source of the regex, and in that interregnum those regex-escaping slashes will become string-character-escaping slashes and disappear before it reaches the regex. That's where the extra slashes come in for this line:

    let targetrx = new RegExp(`^<(p|h1|h2|h3|h4|h5|h6)>(${escTgt}.*)<\\/\\1>`,'mi');
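For anyone who wants to see the group numbering and the double escaping side by side, here is a small sketch. The sample strings are made up, and it runs in plain JavaScript rather than the Roll20 sandbox.

```javascript
// The same pattern written as a regex literal and as a string passed to new RegExp.
const fromLiteral = /^<(p|h1|h2|h3|h4|h5|h6)>(Target Text.*)<\/\1>/mi;
const fromString = new RegExp('^<(p|h1|h2|h3|h4|h5|h6)>(Target Text.*)<\\/\\1>', 'mi');

// In the string form each doubled backslash collapses to one character,
// so the tail '<\\/\\1>' is six characters long: < \ / \ 1 >
const tailLength = '<\\/\\1>'.length;

// Made-up sample data.
const sample = '<p>Target Text: hello</p>';
const a = fromLiteral.exec(sample);
const b = fromString.exec(sample);
// a[0] is the full match, a[1] is the tag name, a[2] is the text inside the tag.

// The backreference \1 has to repeat whatever group 1 captured,
// so a closing tag that does not match the opening tag fails.
const mismatched = fromLiteral.exec('<p>Target Text: hello</h1>');
```

Both regexes return the same groups for `sample`, and `mismatched` is `null` because `</h1>` does not close the `<p>` that group 1 captured.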
1614896540
Nick O.
Forum Champion
Thanks!