| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've found that there's a buffer overrun bug in flex that's triggered by
using alternation in a lookahead pattern.
Fortunately, we don't need to match the exact {NEWLINE} expression to
detect an empty pragma. It suffices to verify that there are no non-space
characters before any newline character. So we can use a simple [\r\n] to
get the desired behavior while avoiding the flex bug.
Fixes the regression of piglit's 17000-consecutive-chars-identifier test,
(which has been crashing since commit
04e40fd337a244ee77ef9553985e9398ff0344af ).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82472
Signed-off-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
CC: <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two problems with the way this script used sed on OS X:
1. The OS X sed doesn't interpret "\r" in a replacement list as a
carriage-return character, (instead it was inserting a literal
'r' character).
We fix this by putting an actual ^M character into the source of
the script, (rather than a two-character escape sequence hoping
for sed to do the right thing).
2. When generating the test files with LF-CR ("\n\r") newlines, the
OS X sed was adding an undesired final newline ("\n") at the end
of the file. We avoid this by first using sed to add the ^M
before the newlines, then using tr to swap the \r and \n
characters. This way, sed never sees any lines ending with
anything but \n, so it doesn't get confused and doesn't add any
bogus extra newlines.
Tested-by: Vinson Lee <[email protected]>
Vinson's testing confirmed that this patch fixes FreeBSD as well.
|
|
|
|
|
|
|
|
|
|
| |
I noticed that with /bin/sh on Mac OS X, "echo -n" does not work as
desired, (it actually prints "-n" rather than suppressing the final
newline). There is a /bin/echo that could be used (it actually works)
instead of the builtin echo.
But I decided it's more robust to just use printf rather than
hardcoding /bin/echo into the script.
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
With two tests both numbered 118, there was a confusing off-by-two difference
between the last test number and the total number of tests (as reported by
glcpp-test).
With this rename, there's only an off-by-one difference left, (which is easy
to understand given the zero-based test numbering).
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here is some additional stress testing of nested macros where the expansion
of macros involves commas, (and whether those commas are interpreted as
argument separators or not in subsequent function-like macro calls).
Credit to the GCC documentation that directed my attention toward this issue:
https://gcc.gnu.org/onlinedocs/gcc-3.2/cpp/Argument-Prescan.html
Fixing the bug required only removing code from glcpp. When first testing the
details of expansions involving commas, I had come to the mistaken conclusion
that an expanded comma should never be treated as an argument separator, (so
had introduced the rather ugly COMMA_FINAL token to represent this).
In fact, an expanded comma should be treated as a separator, (as tested here),
and this treatment can be avoided by judicious use of parentheses (as also
tested here).
With this simple removal of the COMMA_FINAL token, the behavior of glcpp
matches that of gcc's preprocessor for all of these hairy cases.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Beyond just listing this in the TESTS variable in Makefile.am, only minor
changes were needed to make this work. The primary issue is that the build
system runs the test script from a different directory than the script
itself. So we have to use the $srcdir variable to find the test input files.
Using $srcdir in this way also ensures that this test works when using an
out-of-tree build.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The (optional) test-specific command-line arguments to be passed to glcpp are
embedded within the source files of some tests, and glcpp-test uses grep to
extract them.
Of course, grep is line-based and looks for the native line-separator to
determine line boundaries. So, for files using non-native line separators,
grep was getting quite confused and passing bogus arguments to glcpp.
Fix this by canonical-izing the line separators in the source file prior to
using grep.
With this commit, the glcpp-test-cr-lf tests pass entirely:
\r: 143/143 tests pass
\r\n: 143/143 tests pass
\n\r: 143/143 tests pass
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes the newline separator is a single character, and sometimes it is two
characters. Before we can fold away and line-continuation backslashes, we
identify the flavor of line separator that is in use.
With this identified, we then correctly search for backslashes followed
immediately by the first character of the line separator.
Also, when re-inserting newlines to replace collapsed newlines, we carefully
insert newlines of the same flavor.
With this commit, almost all remaining test are fixed as tested by
glcpp-test-cr-lf:
\r: 142/143 tests pass
\r\n: 142/143 tests pass
\n\r: 143/143 tests pass
(The only remaining failures have nothing to do with the actual pre-processor
code, but are due to a bug in the way the test suite uses grep to try to
extract test-specific command-line options from the source files.)
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some tests were failing because the message printed by #error was including a
'\r' character from the source file in its output.
This is easily avoided by fixing the regular expression for #error to never
include any of the possible newline characters, (neither '\r' nor '\n').
With this commit 2 tests are fixed for each of the '\r' and '\r\n' cases.
Current results after the commit are:
\r: 137/143 tests pass
\r\n 142/143 tests pass
\n\r: 139/143 tests pass
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL specification says that either carriage-return, line-feed, or both
together can be used to terminate lines. Further, it says that when used
together, the pair of terminators shall be interpreted as a single line.
This final requirement has not been respected by glcpp up until now, (it has
been emitting two newlines for every CR+LF pair).
Here, we fix the lexer by using a regular expression for NEWLINE that eats
up both "\r\n" (or even "\n\r") if possible before also considering a single
'\n' or a single '\r' as a line terminator.
Before this commit, the test results are as follows:
\r: 135/143 tests pass
\r\n: 4/143 tests pass
\n\r: 4/143 tests pass
After this commit, the test results are as follows:
\r: 135/143 tests pass
\r\n: 140/143 tests pass
\n\r: 139/143 tests pass
So, obviously, a dramatic improvement.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL specification has a very broad definition of what is a
newline. Namely, it can be the carriage-return character, '\r', the newline
character, '\n', or any combination of the two, (though in combination, the
two are treated as a single newline).
Here, we add a new test-runner, glcpp-test-cr-lf, that, for each possible
line-termination combination, runs through the existing test suite with all
source files modified to use those line-termination characters. Instead of
using the .expected files for this, this script assumes that the regular test
suite has been run already and expects the output to match the .out
files. This avoids getting 4 test failures for any one bug, and instead will
hopefully only report bugs actually related to the line-termination
characters.
The new testing is not yet integrated into "make check". For that, some
munging of the testdir option will be necessary, (to support "make check" with
out-of-tree builds). For now, the scripts can just be run directly by hand.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit, the following snippet would trigger an error in glcpp:
#define FOO defined BAR
#if FOO
#endif
The problem was that support for the "defined" operator was implemented within
the grammar, (where the parser was parsing the tokens of the condition
itself). But what is required is to interpret the "defined" operator that
results after macro expansion is performed.
I could not find any fix for this case by modifying the grammar alone. The
difficulty is that outside of the grammar we already have a recursive function
that performs macro expansion (_glcpp_parser_expand_token_list) and that
function itself must be augmented to be made aware of the semantics of the
"defined" operator.
The reason we can't simply handle "defined" outside of the recursive expansion
function is that not only must we scan for any "defined" operators in the
original condition (before any macro expansion occurs); but at each level of
the recursive expansion, we must again scan the list of tokens resulting from
expansion and handle "defined" before entering the next level of recursion to
further expand macros.
And of course, all of this is context dependent. The evaluation of "defined"
operators must only happen when we are handling preprocessor conditionals,
(#if and #elif) and not when performing any other expansion, (such as in the
main body).
To implement this, we add a new "mode" parameter to all of the expansion
functions to specify whether resulting DEFINED tokens should be evaluated or
ignored.
One side benefit of this change is that an ugly wart in the grammar is
removed. We previously had "conditional_token" and "conditional_tokens"
productions that were basically copies of "pp_token" and "pp_tokens" but with
added productions for the various forms of DEFINED operators. With the new
code here, those ugly copy-and-paste productions are eliminated from the
grammar.
A new "make check" test is added to stress-test the code here.
This commit fixes the following Khronos GLES3 CTS tests:
conditional_inclusion.basic_2_vertex
conditional_inclusion.basic_2_fragment
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were passing these through, just like any other pragma. But the
downstream compiler was tripping up on them. It seems easier to swallow these
in the preprocessor and not pass them on at all rather than fixing the
downstream compiler.
This fixes the following Khronos GLES3 CTS tests:
preprocessor.pragmas.pragma_vertex
preprocessor.pragmas.pragma_fragment
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the #pragma directive was swallowing an entire line, (including
the final newline). At that time it was appropriate for it to increment the
line count.
More recently, our handling of #pragma changed to not include the newline. But
the code to increment yylineno stuck around. This was causing __LINE__ to be
increased by one more than desired for every #pragma.
Remove the bogus, extra increment, and add a test for this case.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
This new "make check" test stresses out the support from the last two commits,
(to esnure that '#' is correctly interpreted as the null directives,
regardless of any whitespace or comments on the same line).
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the fix for the following line:
# // comment to ignore here
According to the translation-phase rules, the comment should be removed before
the preprocessor looks to interpret the null directive.
So in our implementation we must explicitly look for single-line comments in
the <HASH> start condition as well.
This commit fixes the following Khronos GLES3 CTS tests:
null_directive_vertex
null_directive_fragment
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This simply tests the previous commit, (that #define followed by a comment
will still generate the expected "#define without macro name" error message).
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were already correctly supporting single-line comments in case like:
#define FOO bar // comment here...
The new support added here is simply for the none-too-useful:
#define // comment instead of macro name
With this commit, this line will now give the expected "#define without
macro name" error message instead of the lexer just going off into the
weeds.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
This ensures that the previous commit indeed generates the expected error
message when a "#define" directive is not followed by anything except for a
newline.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, glcpp would emit an error like this if <EOF> happened to occur
immediately after the "#define", but in general would just get confused,
(leading to un-helpful error messages).
To fix things to generate a clean error message, we do a few things:
1. Don't require horizontal whitespace immediately after #define
2. Add a production for the error case, (DEFINE_TOKEN followed
immediately by a NEWLINE token).
3. Make the lexer reset to the <INITIAL> state after every NEWLINE.
This 3rd point prevents the lexer from getting so confused and generating
further spurious errors in the file because it was stuck in the <DEFINE> start
condition.
We also drop the similar error message from the <EOF> rule since the
newly-added rule will have already printed the error message.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.
ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.
Signed-off-by: Kenneth Graunke <[email protected]>
v2 (Jason Ekstrand): More realloc uses and some scons fixes
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Define the macro GL_OES_standard_derivatives as 1 if the extension
GL_OES_standard_derivatives is supported.
V2 [Chris]: Correct trailing whitespace
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
ERROR is a #define in the MSVC WinGDI.h header file.
Add the _TOKEN suffix as we do for a few other lexer tokens.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've had multiple bugs in the past where we have been inadvertently matching
the default rule, (which we never want to do). We recently added a catch-all
rule to avoid this, (and made this rule robust for future start conditions).
Kristian pointed out that flex allows us to go one step better. This syntax:
%option warn nodefault
instructs flex to not generate the default rule at all. Further, flex will
generate a warning at compile time if the set of rules we provide are
inadequate, (such that it would be possible for the default rule to be
matched).
With this warning in place, I found that the catch-all rule was in fact
missing something. The catch-all rule uses a pattern of "." which doesn't
match newlines. So here we extend the newline-matching rule to all start
conditions. That is enough to convince flex that it really doesn't need
any default rule.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Using a single rule here means that we can use the <*> syntax to match
all start conditions. This makes the catch-all rule more robust against
the addition of future start conditions, (no need to maintain an ever-
growing list of start conditions for this rul).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no behavioral change here. It's just easier to verify that lists
of start conditions include all expected conditions when they appear in a
consistent order.
The <INITIAL> state is special, so it appears first in all lists. All others
appear in alphabetical order.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some of the recent glcpp bug-fixing, we found that glcpp was emitting
unrecognized characters from the input source file to stdout, and dropping
them from the source passed onto the compiler proper.
This was obviously confusing, and totally undesired.
The bogus behavior comes from an implicit default rule in flex, which is
that any unmatched character is implicitly matched and printed to stdout.
To avoid this implicit matching and printing, here we add an explicit
catch-all rule. If this rule ever matches it prints an internal compiler
error. The correct response for any such error is fixing glcpp to handle
the unexpected character in the correct way.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the '\r' character was not explicitly matched by any lexer
rule. This means that glcpp would have been using the default flex rule to
match '\r' characters, (where they would have been printed to stdout rather
than actually correctly handled).
With this commit, we treat '\r' as equivalent to '\n'. This is clearly an
improvement the bogus printing to stdout. The resulting behavior is compliant
with the GLSL specification for any source file that uses exclusively '\r' or
'\n' to separate lines.
For shaders that use a multiple-character line separator, (such as "\r\n"),
glcpp won't be precisely compliant with the specification, (treating these as
two newline characters rather than one), but this should not introduce any
semantic changes to the shader programs.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test is written to exercise a bug which I recently wrote, (but
fortunately caught and fixed before ever committing it).
For the curious:
The bug happened when the NEWLINE_CATCHUP code didn't actually return the
NEWLINE token (due to the skipping). This resulted in the lexer continuing
on through all the subsequent rules while still in the NEWLINE_CATCHUP start
condition, (which then triggered the internal-compiler-error catch-all
rule).
What is intended is for the return of the NEWLINE token to start a new
iteration of the lexer loop, at which time the NEWLINE_CATCHUP-handling code
will reset from the <NEWLINE_CATCHUP> to the <INITIAL> start condition.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
At one point while rewriting the lexing rule for pre-processing numbers, I
made it a bit too aggressive and within a replacement list sucked up a
parameter name that appeared immediately after a period. This caused the
parameter name to be unreplaced when the macro was expanded.
It was in some piglit tests that I originally found this issue. Here, I'm
adding a test to "make check" to ensure that this behavior remains correct.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These operators aren't defined for preprocessor expressions, so we never
implemented them. This led them to be misinterpreted as strings of unary
'+' or '-' operators.
In fact, what is actually desired is to generate an error if these operators
appear in any preprocessor condition.
So this commit looks like it is strictly adding support for these
operators. And it is supporting them as far as passing them through to the
subsequent compiler, (which was already happening anyway).
What's less apparent in the commit is that with these tokens now being lexed,
but with no change to the grammar for preprocessor expressions, these
operators will now trigger errors there.
A new "make check" test is added to verify the desired behavior.
This commit fixes the following Khronos GLES3 CTS test:
invalid_op_1_vertex
invalid_op_1_fragment
invalid_op_2_vertex
invalid_op_2_fragment
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will emit an error for something like:
#define FOO(x,x) ...
Obviously, it's not a legal thing to do, and it's easy to check.
Add a "make check" test for this as well.
This fixes the following Khronos GLES3 CTS tests:
invalid_function_definitions.unique_param_name_vertex
invalid_function_definitions.unique_param_name_fragment
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Just reading the code, it looked like a bug that _define_object_macro had this
check, but _define_function_macro did not. Upon further reading, that's
because the check is to allow for our builtins to be defined, (and there are
no builtin function-like macros).
Add my new understanding as a comment to help the next reader.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had a single token for "#if" but now that we have two separate
tokens, it looks much better to see:
HASH_TOKEN IF
than:
HASH_TOKEN HASH_IF
(Note, that for the same reason we use HASH_TOKEN instead of HASH, we also use
DEFINE_TOKEN instead of DEFINE to avoid a conflict with the <DEFINE> start
condition in the lexer.)
There should be no behavioral change from this commit.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's legal (though highly bizarre) for a pre-processor directive to look like
this:
# /* why? */ define FOO bar
This behavior comes about since the specification defines separate logical
phases in a precise order, and comment-removal occurs in a phase before the
identification of directives.
Our implementation does not use an actual separate phase for comment removal,
so some extra care is necessary to correctly parse this. What we want is for
'#' to introduce a directive iff it is the first token on a line, (ignoring
whitespace and comments). Previously, we had a lexical rule that worked only
for whitespace (not comments) with the following regular expression to find a
directive-introducing '#' at the beginning of a line:
HASH ^{HSPACE}*#{HSPACE}*
In this commit, we switch to instead use a simple literal match of '#' to
return a HASH_TOKEN token and add a new <HASH> start condition for whenever
the HASH_TOKEN is the first non-space token of a line. This requires the
addition of the new bit of state: first_non_space_token_this_line.
This approach has a couple of implications on the glcpp parser:
1. The parser now sees two separate tokens, (such as HASH_TOKEN and
HASH_DEFINE) where it previously saw one token (HASH_DEFINE) for
the sequence "#define". This is a straightforward change throughout
the grammar.
2. The parser may now see a SPACE token before the HASH_TOKEN token of
a directive. Previously the lexical regular expression for {HASH}
would eat up the space and there would be no SPACE token.
This second implication is a bit of a nuisance for the parser. It causes a
SPACE token to appear in a production of the grammar with the following two
definitions of a control_line:
control_line
SPACE control_line
This is really ugly, since normally a space would simply be a token
separator, so it wouldn't appear in the tokens of a production. This leads to
a further problem with interleaved spaces and comments:
/* ... */ /* ... */ #define /* ..*/
For this, we must not return several consecutive SPACE tokens, or else we would need an arbitrary number of new productions:
SPACE SPACE control_line
SPACE SPACE SPACE control_line
ad nauseam
To avoid this problem, in this commit we also change the lexer to emit only a
single SPACE token for any series of consecutive spaces, (whether from actual
whitespace or comments). For this compression, we add a new bit of parser
state: last_token_was_space. And we also update the expected results of all
necessary test cases for the new compression of space tokens.
Fortunately, the compression of spaces should not lead to any semantic changes
in terms of what the eventual GLSL compiler sees.
So there's a lot happening in this commit, (particularly for such a tiny
feature). But fortunately, the lexer itself is looking cleaner than ever. The
only ugly bit is all the state updating, but it is at least isolated to a
single shared function.
Of course, a new "make check" test is added for the new feature, (directives
with comments and whitespace interleaved in many combinations).
And this commit fixes the following Khronos GLES3 CTS tests:
function_definition_with_comments_vertex
function_definition_with_comments_fragment
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is in preparation for the planned addition of a new <HASH> start
condition to the lexer. Both start conditions and token types are, of course,
in the same default C namespace, so a start condition and a token type with
the same name will collide. (And unfortunately, they are both apparently
implemented as equivalent numeric types so the collision is undetected at
compile time and simply leads to unpredictable behavior at run time.)
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit does not cause any behavioral change for any valid program. Prior
to entering the <DEFINE> start condition, the only valid start condition is
<INITIAL>, so whether pushing/popping <DEFINE> onto the stack or explicit
returning to <INITIAL> is equivalent.
The reason for this change is that we are planning to soon add a start
condition for <HASH> with the following semantics:
<HASH>: We just saw a directive-introducing '#'
<DEFINE>: We just saw "#define" starting a directive
With these two start conditions in place, the only correct behavior is to
leave <DEFINE> by returning to <INITIAL>. But the old push/pop code would have
returned to the <HASH> start condition which would then cause an error when
the next directive-introducing '#' would be encountered.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The verbose debug output from the parser is quite useful when debugging, and
having this available as a command-line option is much more convenient than
manually forcing this into the code when needed, (which is what I had been
doing for too long previously).
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the first line we were initializing the column to 1, but for all
subsequent lines we were initializing the column to 0. The column number is
advanced for each token read before any error message is printed. So the 0
value is the correct initialization, (so that the first column is reported as
column 1).
With this extremely minor change, many of the .expected files are updated such
that error messages for the first line now have the correct column number in
them.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
It makes more sense to print the directive name with the preceding '#'.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here, "skipping" refers to the lexer not emitting any tokens for portions of
the file within an #if condition (or similar) that evaluates to false.
Previously, the lexer had a special <SKIP> start condition used to control
this skipping. This start condition was not handled like a normal start
condition. Instead, there was a particularly ugly block of code set to be
included at the top of the generated lexing loop that would change from
<INITIAL> to <SKIP> or from <SKIP> to <INITIAL> depending on various pieces of
parser state, (such as parser->skip_state and parser->lexing_directive).
Not only was that an ugly approach, but the <SKIP> start condition was
complicating several glcpp bug fixes I attempted recently that want to use
start conditions for other purposes, (such as a new <HASH> start condition).
The recently added RETURN_TOKEN macro gives us a convenient way to implement
skipping without using a lexer start condition. Now, at the top of the
generated lexer, we examine all the necessary parser state and set a new
parser->skipping bit. Then, in RETURN_TOKEN, we examine parser->skipping to
determine whether to actually emit the token or not.
Besides this, there are only a couple of other places where we need to examine
the skipping bit (other than when returning a token):
* To avoid emitting an error for #error if skipped.
* To avoid entering the <DEFINE> start condition for a #define that is
skipped.
With all of this in place in the present commit, there are hopefully no
behavioral changes with this patch, ("make check" still passes all of the
glcpp tests at least).
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
Now that we have a common macro for returning tokens, it makes sense to
perform some of the common work there, (such as copying string values).
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glcpp parser is line-based, so it needs to see a NEWLINE token at the end
of each line. This causes a trick for files that end without a final newline.
Previously, the lexer for glcpp punted in this case by unconditionally
returning a NEWLINE token at end-of-file, (causing most files to have an extra
blank line at the end). Here, we refine this by lexing end-of-file as a
NEWLINE token only if the immediately preceding token was not a NEWLINE token.
The patch is a minor change that only looks huge for two reasons:
1. Almost all glcpp test result ".expected" files are updated to drop
the extra newline.
2. All return statements from the lexer are adjusted to use a new
RETURN_TOKEN macro that tracks the last-token-was-a-newline state.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glcpp implementation has long had code to support a file that ends without
a final newline. But we didn't have a "make check" test for this.
Additionally, the <EOF> action was restricted only to the <INITIAL> state so
it would fail to get invoked if the EOF was encountered in the <COMMENT> or
the <DEFINE> case. Neither of these was a bug, per se, since EOF in either
of these cases is an error anyway, (either "unterminated comment" or
"missing macro name for #define").
But with the new explicit support for these cases, we not generate clean error
messages in these cases, (rather than "unexpected $end" from before).
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NEWLINE_CATCHUP code is only intended to be invoked after we lex an actual
newline character ('\n'). The two extra calls here were apparently added
accidentally because the pattern happened to contain a (negated) '\n',
(see commit 6005e9cb283214cd57038c7c5e7758ba72ec6ac2).
I don't think either case could have caused any actual bug. (In the first
case, the pattern matched right up to the next newline, so the NEWLINE_CATCHUP
code was just about to be called. In the second case, I don't think it's
possible to actually enter the <SKIP> start condition after commented newlines
without any intervening newline.)
But, if nothing else, the code is cleaner without these extra calls.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent adddition of an error for "#define followed by a non-identifier"
was a bit to aggressive since it used a regular expression in the lexer to
flag any character that's not legal as the first character of an identifier.
But we need to allow comments to appear here, (since we aren't removing
comments in a preliminary pass). So we refine the error here to only flag
characters that could not be an identifier, nor a comment, nor whitespace.
We also augment the existing comment support to be active in the <DEFINE>
state as well.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if the preprocessor encountered a #define with a non-identifier,
such as:
#define 123 456
The lexer had no explicit rules to match non-identifiers in the <DEFINE> start
state. Because of this, flex's default rule was being invoked, (printing
characters to stdout), and all text was being discarded by the compiler until
the next identifier. As one can imagine, this led to all sorts of interesting
and surprising results.
Fix this by adding an explicit rule complementing the existing
identifier-based rules that should catch all non-identifiers after #define and
reliably give a well-formatted error message.
A new test is added to "make check" to ensure this bug stays fixed.
This commit also fixes the following Khronos GLES3 CTS test:
define_non_identifier_vertex
(The "fragment" variant was passing earlier only because the preprocessor was
behaving so randomly and causing the compilation to fail. It's lucky, in fact,
that the "vertex" version succesfully compiled so we could find and fix this
bug.)
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
This test simply has one of each directive, all of which are preceded by a
single space character.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glcpp lexer and parser use the space_tokens state bit to avoid emitting
tokens for spaces while parsing a directive. Previously, this bit was only
being set again by the first non-space token following a directive.
This led to a bug where a space, (or a comment that should emit a space),
immediately following a directive, (optionally searated by newlines), would be
omitted from the output.
Here we fix the bug by also setting the space_tokens bit whenever we lex a
newline in the standard start conditions.
|