@q bugs.w@>
@q% Copyright Dave Bone 1998 - 2015@>
@q% /*@>
@q% This Source Code Form is subject to the terms of the Mozilla Public@>
@q% License, v. 2.0. If a copy of the MPL was not distributed with this@>
@q% file, You can obtain one at http://mozilla.org/MPL/2.0/.@>
@q% */@>
@** Bugs --- ugh.
Do I make them? Sure do!
They have all the makings of Darwinism with sloppiness due to work overload and not enough self evaluation ...
Forget the platitudes Dude. Hey it's Dave.
The testsuite raised 3 basic programming misconceptions that should enlighten u the reader in your use of Yacco2.\fbreak
\ptindent{1) posting of errors within a thread versus a standalone grammar}
\ptindent{2) off by xxx in error token co-ordinates}
\ptindent{3) refining of lookahead expression for common prefix recognition}
@ Posting of errors within a grammar.\fbreak
There are 2 ways to post an error within a grammar:\fbreak
\ptindent{1) |ADD_TOKEN_TO_ERROR_QUEUE| macro or parser's |add_token_to_error_queue| procedure}
\ptindent{2) |RSVP| macro to post the error or parser's |add_token_to_producer| procedure}
The |ADD_TOKEN_TO_ERROR_QUEUE| route should be used only within a standalone grammar.
Why? Due to parallelism (nondeterminism), threads can misfire but the overall parsing is fine because some thread will accept its parse or the calling grammar that set things in motion also uses the conditional parsing features of shifting over reduce.
Putting an error token into the |accept queue| via |RSVP| allows the calling grammar to discriminate by arbitration as to what token to accept.
Then within the grammar's subrule that handles the error token, the syntax-directed code would post the accepted error as an error and possibly shut down the parse by parser's |set_stop_parse(true)| routine.
This allows the grammar writer more flexibility in error handling.
So the moral of this story is ``know your parsing run context'' when posting errors.
So where's my mistake?
I used the |RSVP| facility within a standalone grammar.
This just shunts the error token into the output token queue instead of the error queue.
@ Off by xxx in error token co-ordinates.\fbreak
When posting an error, the newly created error token needs to be associated with the source file position.
So how is this accomplished?
Typically one uses a previous token with established co-ordinates to set the error token's co-ordinates via the parser's |set_rc| routine.
So why the off by xxx syndrome?
What co-ordinate do u associate an error with when the grammar's subrule uses the wild card facility ${\vert+\vert}$ to catch an error?
Is it the |current_token| routine? Not really.
Depending on the syntax-directed code context, this can be the lookahead token after the shift operation.
That is, it is one past the currently shifted token on the parse stack represented symbolically by ${\vert+\vert}$.
So use the appropriate |Sub_rule_xxx.p1__| parameter in the syntax-directed procedure to reference the stacked token where xxx is the subrule number of this subrule.
|p1__| is the first component of the subrule's right-hand-side expression.
If there are other components that are to be referenced, |px__| will have the appropriate component number replacing x.
There are other facilities open to the grammar writer like the |start_token| routine that provides the starting token passed to the thread for parsing for co-ordinate references.
One can also store token references within the grammar rules as parsing takes place so that a ``roll your own'' fiddling within contexts can be programmed: To each their own.
@ Refining of lookahead expression for keyword recognition.\fbreak
I describe this problem using a concrete example but the problem is generic when there are competing threads that can have common prefixes.
If a keyword is ``emitfile'' then should it accept ``emitfilex'' when the balance ``x'' follows?
No it should not.
This is the common prefix problem where the lookahead boundary to accept or abort the parse was wrongly programmed.
As threads can fine tune their lookahead expression by the grammar's ``parallel-la-boundary'' statement, the correction is to properly program this expression.
See the |yacco2_linker_keywords.lex| grammar for the proper lookahead expression to abort such a parse.
@ Take 2: Teaching the teacher --- why off by xxx in error co-ordinates?\fbreak
In testing linker's grammars, the following came up.
What co-ordinate does one associate with an error token when the ``eog'' token produces the error?
Remember, the meta-terminals like eog, ${\vert.\vert}$, ${\vert+\vert}$ have no co-ordinates associated with them as they are shared across all token containers.
Quick review of co-ordinates: There are 4 parts to a token's co-ordinates:\fbreak
\ptindent{1) a file number kept by Yacco2 giving the filename source of token}
\ptindent{2) line number within the file}
\ptindent{3) character position on the line}
\ptindent{4) character position within the physical file returned by the I/O routine}
Points 1 -- 3 are used for now to print out errors.
Well I had to check the source code of the |set_rc| procedure.
It was programmed properly and had comments to boot describing the problem.
|set_rc| marches back thru the token container looking for a token having physical co-ordinates.
Call the number of moves thru the container the displacement.
This displacement is summed with the character position of the referenced token to point to a spot on the source line.
The other co-ordinate attributes are just copied.
Depending on the token's length in display characters, the displacement will be xxx moves to the right of the referenced token's co-ordinates.
This is why the error token graphic ${\uparrow}$ is offset within the error message displaying against the source line and character spot.
Depending on the length of the token in display characters, the offset could be within the interior of the token or to the right of it.
This offset shows that a potential overflow was prevented when the ``eog'' token halts the parse.
Have a look at testsuite's ``no end-T-alphabet keyword'' or the last test ``end-list-of-transitive-threads keyword not present'' illustrating this type of situation.
What about the other meta-terminals?
This is a roll-your-own-laddie.
Grab whatever context u want to associate the co-ordinates against the error token.
In the case of the ${\vert.\vert}$, the meta-epsilon token, u can use the |current_token()| as its co-ordinate context.
Well go tell it to the Yacco2 parse library.
See |set_rc| writeup.
The moral of this tidbit? A user manual must be written.
I know but this will be my next venture.\fbreak
4 May 2005.
@ Missed associating the root node thread to its called threads' terminals.\fbreak
This shows how the forest and the trees (pun intended) got blurred.
The scoping of a screen just does not allow one to view well a general perspective: batteries not required nor a camel to carry paper and a caf\'e au lait.
Now the 2 errors:\fbreak
The |visit_graph| reserved the right amount of space but I mistakenly thought that the container's size procedure gave the number of elements reserved --- nope.
So the |visit_graph|'s check on visited became true immediately.
The 2nd mistake was a compound logic error: the root node thread was not associated with the called threads' first sets --- which is the raison d'\^etre for all this recursion.
The more subtle error was checking in the global terminal's set for the called thread's existence.
If it was present, the root node thread was bypassed in checking whether it should be entered into the global terminal set.
Boy sometimes you're dump Dave --- spelling correct at the time: haha.
This error should have been caught in the test suites.
It requires a proper set of faked grammars to test out transitive closure: i.e., what about nested calling grammars, various flavors of epsilon: a complete pass thru grammar, ${\vert.\vert}$ to boundary detection, etc.
Plain and simple I was procrastinating as real work on other things is building.
The scaffolding for these other suites requires a bit more thought and much more effort.
Ahhh self analysis watching the watched watching --- I leave this to Freud but it still cost 3 hours of time lost to ferret out these bloopers.
Okay I'm crafting more test suites to cruise-control these esoteric conditions.
It demonstrates the requirement of well calibrated test suites as a necessary dimension to programming along with comments as in literate programming, and pre / post constraint declarations within code to catch realtime strange-ites.
Without these tested dimensions, what assurances or confidence does one have to say that the code is correct?
Not much but hand wavings and hot ...
Let's hear it for QA and its exercising regime to come out and strut itself.\fbreak
11 May 2005.
@ Dynamic reserve space for |Visit_graph|.\fbreak
The |std::vector Visit_graph();| was originally calculating the number of threads before reserving space via the ``reserve(xxx)'' procedure.
This is a simple way to associate a visited node for my recursive graph walks: stop those revisits.
Somehow this space was getting reallocated and destroying my trees.
So out damn dynamics and in with static array no template please...\fbreak
Nov. 2007\fbreak
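@ Sketch: reserve versus size, and why a static array sidesteps reallocation.\fbreak
The two |Visit_graph| bloopers above both come down to how a standard vector manages its space, so here is a minimal C++ sketch of the general behaviour.
It is not Yacco2's code: every name in it (|make_reserved_only|, |make_sized|, |growth_reallocates|, |Visit_table|, |first_visit|, |MAX_THREADS|) is hypothetical and the bound of 64 threads is an assumption made for illustration only.
It shows that reserving space leaves the element count at zero, that growing past the reserved capacity forces a reallocation, and what a plain fixed array of visited flags might look like.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; none of this is Yacco2's actual code.

// reserve() raises capacity() only: the vector still holds zero elements.
std::vector<int> make_reserved_only(std::size_t no_of_threads) {
    std::vector<int> visited;
    visited.reserve(no_of_threads);
    return visited;
}

// Sizing constructor: one "not visited" slot per thread,
// so size() == no_of_threads.
std::vector<int> make_sized(std::size_t no_of_threads) {
    return std::vector<int>(no_of_threads, 0);
}

// Growing past the reserved capacity forces a reallocation, which
// invalidates every pointer, reference and iterator into the old storage.
bool growth_reallocates() {
    std::vector<int> visited;
    visited.reserve(4);
    const std::size_t old_capacity = visited.capacity();
    while (visited.size() <= old_capacity) visited.push_back(0);
    return visited.capacity() > old_capacity;
}

// Static alternative: a plain array with a fixed, assumed upper bound.
const std::size_t MAX_THREADS = 64;
struct Visit_table {
    bool visited[MAX_THREADS] = {};  // all flags start as "not visited"
};

// Marks thread_id as visited; returns true only on the first visit.
bool first_visit(Visit_table& table, std::size_t thread_id) {
    assert(thread_id < MAX_THREADS);
    if (table.visited[thread_id]) return false;
    table.visited[thread_id] = true;
    return true;
}
```

A recursive graph walk would call something like |first_visit| on entry to each node and return at once when it answers false.
The fixed array never moves, so references held across the recursion stay valid; the price is a compile-time bound that has to be large enough.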