| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 6005e9cb28 a new start state of NEWLINE_CATCHUP was added to the
lexer. This start state is used whenever the lexer is emitting a NEWLINE token
to emit additional NEWLINE tokens for any newline characters that were skipped
by an immediately preceding multi-line comment.
However, that commit erroneously entered the NEWLINE_CATCHUP state for
single-line comments. This is not desired since in the case of a single-line
comment, the lexer is not emitting any NEWLINE token. The result is that the
lexer will remain in the NEWLINE_CATCHUP state and proceed to fail to emit a
NEWLINE token for the subsequent newline character, (since the case to match \n expects only the INITIAL start state).
The fix is quite simple, remove the "BEGIN NEWLINE_CATCHUP" code from the
single-line comment case, (preserving it only in exactly the cases where the
lexer is actually emitting a NEWLINE token).
Many thanks to Petri Latvala for reporting this bug and for providing the
minimal test case to exercise it. The bug showed up only with a multi-line
comment which was followed immediately by a single-line comment (without any
intervening newline), such as:
/*
*/ // Kablam!
Since 6005e9cb28, and before this commit, that very innocent-looking
combination of comments would yield a parse failure in the compiler.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
definition)
The preprocessor has always replaced multi-line comments with a single space
character, (as required by the specification), but as of commit
bd55ba568b301d0f764cd1ca015e84e1ae932c8b the lexer also emitted a NEWLINE
token for each newline within the comment, (in order to preserve line
numbers).
The emitting of NEWLINE tokens within the comment broke the rule of "replace a
multi-line comment with a single space" as could be exposed by code like the
following:
#define FOO a/*
*/b
FOO
Prior to commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b, this code defined
the macro FOO as "a b" as desired. Since that commit, this code instead
defines FOO as "a" and leaves a stray "b" in the output.
In this commit, we fix this by not emitting the NEWLINE tokens while lexing
the comment, but instead merely counting them in the commented_newlines
variable. Then, when the lexer next encounters a non-commented newline it
switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as
necessary (so that subsequent parsing stages still generate correct line
numbers).
Of course, it would have been more clear if we could have written a loop to
emit all the newlines, but flex conventions prevent that, (we must use
"return" for each token we emit).
It similarly would have been clear to have a new rule restricted to the
<NEWLINE_CATCHUP> state with an action much like the body of this if
condition. The problem with that is that this rule must not consume any
characters. It might be possible to write a rule that matches a single
lookahead of any character, but then we would also need an additional rule to
ensure for the <EOF> case where there are no additional characters available
for the lookahead to match.
Given those considerations, and given that the SKIP-state manipulation already
involves a code block at the top of the lexer function, before any rules, it
seems best to me to go with the implementation here which adds a similar
pre-rule code block for the NEWLINE_CATCHUP.
Finally, this commit also changes the expected output of a few, existing glcpp
tests. The change here is that the space character resulting from the
multi-line comment is now emitted before the newlines corresponding to that
comment. (Previously, the newlines were emitted first, and the space character
afterward.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two things make this code confusing:
1. The uncharacteristic manipulation of lexer start state outside of
flex rules.
2. The confusing semantics of the skip_stack (including the
"lexing_if" override and the SKIP_NO_SKIP state).
This new comment is intended to bring a bit more clarity for any readers.
There is no intended beahvioral change to the code here. The actual code
changes include better indentation to avoid an excessively-long line, and
using the more descriptive INITIAL rather than 0.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The GLSL ES 3.0 spec (Section 12.17) says:
"GLSL ES 1.00 removed token pasting and other functionality."
NOTE: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
And add test cases to ensure that this works
- 110 verifies that glcpp rejects #elif<digits> which glcpp
previously accepted.
- 111 verifies that glcpp accepts #if followed immediately by
(, +, -, !, or ~.
- 112 does the same as 111 but for #elif.
See 17f9beb6 for #if change.
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
| |
Fixes part of es3conform's preprocess16_frag test.
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used lookahead patterns to differentiate:
#define FOO(x) function macro
#define FOO (x) object macro
Unfortunately, our rule for function macros:
{HASH}define{HSPACE}+/{IDENTIFIER}"("
relies on infinite lookahead, and apparently triggers a Flex bug where
the generated code overflows a state buffer (see YY_STATE_BUF_SIZE).
There's no need to use infinite lookahead. We can simply change state,
match the identifier, and use a single character lookahead for the '('.
This apparently makes Flex not generate the giant state array, which
avoids the buffer overflow, and should be more efficient anyway.
Fixes piglit test 17000-consecutive-chars-identifier.frag.
NOTE: This is a candidate for every release branch ever.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL specification requires that #line directives be interpreted
after macro expansion. Our existing implementation of #line macros in
the lexer prevents conformance on this point.
Moving the handling of #line from the lexer to the parser gives us the
macro expansion we need. An additional benefit is that the
preprocessor also now supports comments on the same line as #line
directives.
Finally, the preprocessor now emits the (fully-macro-expanded) #line
directives into the output. This allows the full GLSL compiler to also
see and interpret these directives so it can also generate correct
line numbers in error messages.
Signed-off-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The trick here is that flex always chooses the rule that matches the most
text. So with a input text of "two:" which we want to be lexed as an
IDENTIFIER token "two" followed by an OTHER token ":" the previous OTHER
rule would match longer as a single token of "two:" which we don't want.
We prevent this by forcing the OTHER pattern to never match any
characters that appear in other constructs, (no letters, numbers, #,
_, whitespace, nor any punctuation that appear in CPP operators).
Fixes bug #44764:
GLSL preprocessor doesn't replace defines ending with ":"
https://bugs.freedesktop.org/show_bug.cgi?id=44764
Reviewed-by: Kenneth Graunke <[email protected]>
NOTE: This is a candidate for stable release branches.
|
| |
|
|
|
|
| |
These are now unnecessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the rule deleted by this commit was matched every single
time (being the longest match). If not skipping, it used REJECT to
continue on to the actual correct rule.
The flex manual advises against using REJECT where possible, as it is
one of the most expensive lexer features. So using it on every match
seems undesirable. Perhaps more importantly, it made it necessary for
the #if directive rules to contain a look-ahead pattern to make them
as long as the (now deleted) "skip the whole line" rule.
This patch introduces an exclusive start state, SKIP, to avoid REJECTs.
Each time the lexer is called, the code at the top of the rules section
will run, implicitly switching the state to the correct one.
Fixes piglit tests 16384-consecutive-chars.frag and
16385-consecutive-chars.frag.
|
| |
|
|
|
|
| |
This is necessary for the main compiler to get correct line numbers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing DECIMAL_INTEGER pattern is the correct thing to use when
looking for a C decimal integer, (that is, a digit-sequence not
starting with 0 which would instead be an octal integer).
But for #line, we really want to accept any digit sequence, (including
"0"), and always interpret it as a decimal constant. So we add a new
DIGITS pattern for this case.
This should fix the compilation failure noted in bug #28138
https://bugs.freedesktop.org/show_bug.cgi?id=28138
(Though the generated file will not be updated until the next commit.)
|
|
|
|
|
|
|
| |
Previously, the YY_USER_ACTION was overwriting the yylloc->source value
in every action, (after that value had been carefully set by the handling
of the #line directive). Instead, we want to initialize it once in
YY_USER_INIT and then not touch it at all in YY_USER_ACTION.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Matching the newline here meant having to do some redundant work here,
(incrementing line number, resetting column number, and returning a
NEWLINE token), that could otherwise simply be left to the existing rule
which matches a newline.
Worse, when the comment rule matches the newline as well, the parser
can lookahead and see a token for something that should actually be skipped.
For example, in a case like this:
#if 0 // comment here
fail
#else
win
#endif
Both fail and win appear in the output, (not that the condition is being
evaluated incorrectly---merely that one token after the comment's newline
was being lexed/parse regardless of the condition).
This commit fixes the above test case, (which is also remarkably similar
to 087-if-comments which now passes).
|
|
|
|
|
|
| |
This was causing line numbering to be off by one. The newline comes
from the NEWLINE token at the end of the line; there's no need to
insert one.
|
|
|
|
|
| |
Also remove the --never-interactive command line option for the
preprocessor lexer. This was already done for main compiler lexer.
|
|
|
|
|
| |
Error messages make more sense this way since the convention is for
the first line of a file to be numbered from 1, rather than 0.
|
|
|
|
|
|
| |
Calling exit() on a memory failure probably made sense for the
standalone preprocessor, but doesn't seem too appealing as part of
the GL library. Also, we don't use it in the main compiler.
|
|
|
|
|
|
|
| |
Fixes:
glsl-version-define
glsl-version-define-110
glsl-version-define-120
|
|
|
|
|
|
|
|
|
|
|
| |
We define the YY_NO_INPUT macro to avoid one needless function being
generated.
for the other needless functions, (yyunput and yy_top_state), we add a
new UNREACHABLE start condition and call these functions from an
action there. This doesn't change functionality at all, (since we
never enter the UNREACHABLE start condition), but makes the compiler
stop complaining about these two functions being defined but not used.
|
|
|
|
|
|
| |
It's really a bug in flex that these functions are generated with neither
a declaration nor the 'static' keyword, but we can at least avoid the
warnings this way.
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if the outer #ifdef/#ifndef evaluated to false, the inner
directive would not be parsed correctly, (the identifier as the subject
of the #ifdef/#ifndef would inadvertently be skipped along with the other
content correctly being skipped).
We fix this by setting the lexing_if state in each case here.
We also add a new test to the test suite to ensure that this case is tested.
|
|
|
|
| |
And add a test case to ensure that this works.
|
| |
|
| |
|
|
|