path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir/serialize: reuse the writemask field for 2 src X swizzles of SSA ALU | Marek Olšák | 2019-11-23 | 1 | -3/+33
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: remove up to 3 consecutive equal ALU instruction headers | Marek Olšák | 2019-11-23 | 1 | -16/+65
    vec4 scalarized ALUs typically have 4 equal instruction headers, so remove the last 3.
    There are no bits left in the ALU header for more flags, so future extensions of NIR will have to use something like instr_type == 15 to describe more complex ALU instructions.
    Reviewed-by: Connor Abbott <[email protected]>
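The header-collapsing idea can be sketched roughly as follows. This is a hypothetical illustration, not the code in the serializer: the function name `count_equal_headers` and the flat array of 32-bit headers are assumptions; only the cap of 4 equal headers (the first plus up to 3 removed repeats) comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: return how many headers starting at index `start`
 * are equal to headers[start], capped at 4 (the first header plus up to
 * 3 repeats). A writer could emit the header once and store (count - 1)
 * in 2 spare bits instead of writing the repeats.
 * Precondition: start < num. */
static unsigned
count_equal_headers(const uint32_t *headers, size_t num, size_t start)
{
   unsigned count = 1;

   while (count < 4 && start + count < num &&
          headers[start + count] == headers[start])
      count++;

   return count;
}
```

A reader would do the inverse: read one header, then reuse it for the following (count - 1) instructions.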
* nir/serialize: try to pack both deref array src into 32 bits | Marek Olšák | 2019-11-23 | 1 | -5/+28
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: cleanup - fold nir_deref_type_var cases into switches | Marek Olšák | 2019-11-23 | 1 | -16/+19
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: try to put deref->var index into the unused bits of the header | Marek Olšák | 2019-11-23 | 1 | -10/+23
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: don't serialize mode for deref non-cast instructions | Marek Olšák | 2019-11-23 | 1 | -5/+12
    It can be derived from src and var. This frees 10 bits in the header that will be used later.
    "mode" is moved in the structure, because those bits will be used for something else later.
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: don't store deref types if not needed | Marek Olšák | 2019-11-23 | 1 | -4/+26
    - type_cast: deduplicate types if the last one is the same
    - derive the type from the parent for other derefs
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: try to pack two alu srcs into 1 uint32 | Marek Olšák | 2019-11-23 | 1 | -21/+76
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: pack nir_intrinsic_instr::const_index[] better | Marek Olšák | 2019-11-23 | 1 | -5/+84
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: pack 1-component constants into 20 bits if possible | Marek Olšák | 2019-11-23 | 1 | -37/+135
    The majority of constants can be packed like this.
    v2: - use enum for the packing encoding,
        - trim packed_value to 20 bits, add 1 bit to last_component, which simplifies a later commit
    Reviewed-by: Connor Abbott <[email protected]>
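One way such a 20-bit packing could work is sketched below. The enum values, function names, and the two chosen encodings (sign-extended low 20 bits, and high 20 bits with zero low bits) are assumptions for illustration; the commit message only states that an enum selects the encoding and that the packed value is 20 bits wide.

```c
#include <stdint.h>

/* Hypothetical encodings; the names are illustrative only. */
enum const_packing {
   CONST_FULL,           /* does not fit: store all 32 bits separately */
   CONST_LO_20BITS_SEXT, /* low 20 bits, sign-extended when read back */
   CONST_HI_20BITS,      /* high 20 bits, the low 12 bits are zero */
};

static enum const_packing
pack_const32(uint32_t value, uint32_t *packed)
{
   int32_t s = (int32_t)value;

   /* Small signed integers in [-2^19, 2^19 - 1]. */
   if (s >= -(1 << 19) && s < (1 << 19)) {
      *packed = value & 0xfffff;
      return CONST_LO_20BITS_SEXT;
   }

   /* Values whose low 12 bits are zero, e.g. 1.0f == 0x3f800000. */
   if ((value & 0xfff) == 0) {
      *packed = value >> 12;
      return CONST_HI_20BITS;
   }

   *packed = value;
   return CONST_FULL;
}

static uint32_t
unpack_const32(enum const_packing enc, uint32_t packed)
{
   switch (enc) {
   case CONST_LO_20BITS_SEXT:
      /* Sign-extend from bit 19 using unsigned wraparound. */
      return (packed ^ 0x80000) - 0x80000;
   case CONST_HI_20BITS:
      return packed << 12;
   default:
      return packed;
   }
}
```

Common shader constants such as 0, 1, -1 and simple floats like 1.0f or 0.5f fall into one of the two packed cases in this sketch, which is consistent with the claim that the majority of constants can be packed.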
* nir/serialize: pack load_const with non-64-bit constants better | Marek Olšák | 2019-11-23 | 1 | -2/+46
    v2: use blob_write_uint8/16
    Reviewed-by: Jason Ekstrand <[email protected]> (v1)
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: try to store a diff in var data locations instead of var data | Marek Olšák | 2019-11-23 | 1 | -15/+73
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: deduplicate serialized var types by reusing the last unique one | Marek Olšák | 2019-11-23 | 1 | -10/+39
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: don't serialize var->data for temporaries | Marek Olšák | 2019-11-23 | 1 | -12/+37
    Reviewed-by: Connor Abbott <[email protected]>
* nir/serialize: pack src better and limit the object count to 1M from 1G | Marek Olšák | 2019-11-23 | 1 | -33/+75
    We need to limit the object count to 1M to free 10 bits for the src modifiers.
    Reviewed-by: Connor Abbott <[email protected]>
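The arithmetic behind this limit: 1G objects need 30 index bits (2^30) and 1M objects need 20 (2^20), so shrinking the index frees 10 bits in a 32-bit word. A minimal sketch of such a layout follows; the field positions, macro names, and function names are assumptions, not the actual serialized format.

```c
#include <stdint.h>

#define OBJECT_INDEX_BITS 20u                            /* 2^20 = 1M objects */
#define MAX_OBJECT_IDX    ((1u << OBJECT_INDEX_BITS) - 1)
#define MODIFIER_BITS     10u                            /* 30 - 20 freed bits */
#define MODIFIER_MASK     ((1u << MODIFIER_BITS) - 1)

/* Hypothetical layout: bits 0..19 hold the object index, bits 20..29 hold
 * the modifier bits, and the top 2 bits are left unused in this sketch. */
static uint32_t
pack_src(uint32_t object_idx, uint32_t modifiers)
{
   return (object_idx & MAX_OBJECT_IDX) |
          ((modifiers & MODIFIER_MASK) << OBJECT_INDEX_BITS);
}

static uint32_t
src_object_idx(uint32_t packed)
{
   return packed & MAX_OBJECT_IDX;
}

static uint32_t
src_modifiers(uint32_t packed)
{
   return (packed >> OBJECT_INDEX_BITS) & MODIFIER_MASK;
}
```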
* nir/serialize: pack instructions better | Marek Olšák | 2019-11-23 | 1 | -106/+297
    Reviewed-by: Connor Abbott <[email protected]>
* nir/range_analysis: Make sure the table validation only occurs once | Ian Romanick | 2019-11-22 | 1 | -38/+58
    All of the tables are static const, so they only need to be validated once. As noted in the previous commit, the compiler should be able to eliminate all of this code when the assertions would pass. Even with the help of the previous commit, this does not always occur.
    -Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
    -O1: No difference proven at 95.0% confidence. N=5
    -O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5
    Reviewed-by: Eric Anholt <[email protected]>
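A run-once guard of the kind this commit describes might look like the sketch below. The names `validate_tables_once` and `validation_runs` are hypothetical, and the counter merely stands in for the real table assertions. Note that a plain static flag like this is not thread-safe.

```c
#include <stdbool.h>

/* Counts how often the (stand-in) validation body actually runs. */
static unsigned validation_runs = 0;

static void
validate_tables_once(void)
{
   /* The tables are static const, so checking them once is enough. */
   static bool first = true;

   if (!first)
      return;
   first = false;

   /* Stand-in for the expensive assertions over the constant tables. */
   validation_runs++;
}
```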
* nir/range-analysis: Add pragmas to help loop unrolling | Ian Romanick | 2019-11-22 | 1 | -0/+10
    I was pretty liberal with these assertions when I wrote this code because I had assumed that GCC would unroll the loops, inline the lookups of static const arrays with now-constant indices, and then eliminate all the actual assertions. It seems none of this happens even at -O3. Adding the pragmas helps encourage loop unrolling at some optimization levels.
    I tested by running shader-db with NIR_VALIDATE=false on a Core i7 Haswell desktop system.
    -Og: No difference proven at 95.0% confidence. N=5
    -O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
    -O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5
    v2: Add a _Pragma to an inner loop that was accidentally dropped during a rebase.
    Reviewed-by: Eric Anholt <[email protected]>
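As an illustration of the technique, the sketch below places an unroll hint on each loop of a table check. The table, its size, and the function name are hypothetical; `_Pragma("GCC unroll N")` is a GCC loop pragma (available in GCC 8 and later), and other compilers may ignore it or warn about it.

```c
#include <stdbool.h>

#define N 4

/* Hypothetical symmetric lookup table standing in for the real ones. */
static const unsigned char table[N][N] = {
   {0, 1, 2, 3},
   {1, 1, 2, 3},
   {2, 2, 2, 3},
   {3, 3, 3, 3},
};

static bool
table_is_symmetric(void)
{
   bool ok = true;

   /* Each hint must directly precede the loop it applies to. With both
    * loops fully unrolled, every index is a compile-time constant, so the
    * optimizer has a chance to fold the whole check away. */
   _Pragma("GCC unroll 4")
   for (unsigned r = 0; r < N; r++) {
      _Pragma("GCC unroll 4")
      for (unsigned c = 0; c < N; c++)
         ok = ok && table[r][c] == table[c][r];
   }

   return ok;
}
```

Whether the compiler actually removes the check still depends on the optimization level, which matches the mixed benchmark results quoted in the commit message.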
* glsl: Add varyings to "zero-init of uninitialized vars" workaround | Danylo Piliaiev | 2019-11-22 | 1 | -1/+2
    Varyings are similar to the cases already handled, and the workaround's name, "glsl_zero_init", already suggests that it should include varyings.
    The issue was observed in the GiMark subtest from GpuTest.
    Signed-off-by: Danylo Piliaiev <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* nir: Add load_sampler_lod_paramaters_pan intrinsic | Alyssa Rosenzweig | 2019-11-22 | 1 | -0/+4
    This loads in the <min_lod, max_lod, lod_bias> settings for a given sampler, which is necessary for lowering clamps/biases on certain Midgard chips.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* nir/serialize: do ctx = {0} instead of manual initializations | Marek Olšák | 2019-11-21 | 1 | -4/+2
    Reviewed-by: Connor Abbott <[email protected]>
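The `ctx = {0}` idiom relies on a C rule: when an initializer list is shorter than the aggregate, the remaining members are zero-initialized. A small sketch with a made-up context struct (the struct and its fields are assumptions, not the serializer's real context type):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context struct; the fields are illustrative only. */
struct write_ctx_example {
   const void *shader;
   uint32_t next_index;
   int strip;
};

static struct write_ctx_example
make_ctx(const void *shader)
{
   /* All members start out as zero / NULL, so only the members that need
    * a non-zero value have to be assigned by hand. */
   struct write_ctx_example ctx = {0};

   ctx.shader = shader;
   return ctx;
}
```

Compared with assigning every field manually, this also keeps newly added fields from being left uninitialized by accident.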
* nir: strip as we serialize to remove the nir_shader_clone call | Marek Olšák | 2019-11-21 | 5 | -134/+34
    Serializing stripped NIR is faster now.
    Reviewed-by: Connor Abbott <[email protected]>
* nir: fix deref offset builder | Dave Airlie | 2019-11-22 | 1 | -1/+1
    Use the correct bit size
    Reviewed-by: Jason Ekstrand <[email protected]>
* vtn/opencl: add clz support | Dave Airlie | 2019-11-22 | 2 | -0/+10
    This is needed for OpenCL
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add 64-bit ufind_msb lowering support. (v2) | Dave Airlie | 2019-11-22 | 2 | -0/+24
    This adds the option to lower 64-bit ufind_msb opcodes.
    v2: use split_x/y removes component loops (Jason)
    Reviewed-by: Jason Ekstrand <[email protected]>
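In scalar C terms, lowering a 64-bit ufind_msb to 32-bit operations can be sketched as below. This is an illustration of the idea under stated assumptions, not the NIR lowering code itself: the helper names are hypothetical, and the convention that a zero input yields -1 is assumed here.

```c
#include <stdint.h>

/* Index of the most significant set bit of a 32-bit value, or -1 if the
 * value is zero. */
static int
ufind_msb32(uint32_t v)
{
   int msb = -1;

   while (v) {
      v >>= 1;
      msb++;
   }
   return msb;
}

/* Split the 64-bit value into its low (x) and high (y) halves and use
 * only 32-bit searches. */
static int
ufind_msb64(uint64_t v)
{
   uint32_t lo = (uint32_t)v;
   uint32_t hi = (uint32_t)(v >> 32);

   /* If any high bit is set, the answer lies in the high half. */
   if (hi != 0)
      return ufind_msb32(hi) + 32;

   return ufind_msb32(lo);
}
```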
* spirv/nir/opencl: handle some multiply instructions. | Dave Airlie | 2019-11-22 | 2 | -0/+55
    This adds support for some missing 24-bit and hi multiply variants.
    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: get the correct type for function returns. | Dave Airlie | 2019-11-22 | 1 | -1/+4
    This needs to be derived from the address format, not always 1/32.
    Suggested by Jason
    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: don't store 0 to cs.ptr_size for non kernel stages. | Dave Airlie | 2019-11-22 | 1 | -1/+0
    cs is a union so storing this there is wrong.
    Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: add missing initialization of the location path field | Iago Toral Quiroga | 2019-11-21 | 1 | -0/+2
    This was apparently missed in 67b32190f3c95, which added support for ARB_shading_language_include to #line, including the 'path' field for the location.
    Fixes crashes in CTS with all drivers as they attempt to access an uninitialized path string during parsing.
    Fixes: 67b32190f3c95 ("glsl: add ARB_shading_language_include support to #line")
    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jose Maria Casanova <[email protected]>
* compiler: move build definition of pp_standalone_scaffolding.c | Timothy Arceri | 2019-11-21 | 2 | -2/+3
    This should fix android build issues while still allowing scons to build the standalone compiler.
    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129
    Reviewed-by: Mark Janes <[email protected]>
* nir/validate: validate num_components on registers and intrinsics | Karol Herbst | 2019-11-21 | 1 | -8/+16
    Also make 8 and 16 components invalid. We will enable that again later when we actually support it.
    v2: fix validation of nir_intrinsic_instr::num_components
        correct validation of instr->num_components
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/large_constants: use nir_index_vars and nir_variable::index | Rhys Perry | 2019-11-20 | 1 | -12/+8
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* nir: add nir_variable::index and nir_index_vars | Rhys Perry | 2019-11-20 | 2 | -0/+41
    This will be useful as a deterministic identifier/index for the variable.
    v2: fix comment style
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]> (v1)
* nir: make nir_variable::{num_members,num_state_slots} a uint16_t | Rhys Perry | 2019-11-20 | 1 | -2/+2
    Doesn't shrink it (at least, on x86-64) and leaves space for more members.
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops | Neil Roberts | 2019-11-20 | 1 | -0/+8
    Reviewed-by: Rob Clark <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* nir: Add a 8-bit bool type | Neil Roberts | 2019-11-20 | 2 | -2/+12
    Adds nir_type_bool8 as well as 8-bit versions of all the bool opcodes.
    Reviewed-by: Rob Clark <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* nir: Add a 16-bit bool type | Neil Roberts | 2019-11-20 | 2 | -1/+11
    Adds nir_type_bool16 as well as 16-bit versions of all the bool opcodes.
    Reviewed-by: Rob Clark <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* nir/opcodes: Add a helper function to generate reduce opcodes | Neil Roberts | 2019-11-20 | 1 | -17/+15
    Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit versions of the reduce operation. This reduces the code duplication a bit and will make it easier to later add 16-bit versions as well.
    Reviewed-by: Rob Clark <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* nir/opcodes: Add a helper function to generate the comparison binops | Neil Roberts | 2019-11-20 | 1 | -20/+14
    Adds binop_compare_all_sizes which generates both 1-bit and 32-bit versions of the comparison operation. This reduces the code duplication a bit and will make it easier to later add 16-bit versions as well.
    Reviewed-by: Rob Clark <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* mesa: add support cursor support for relative path shader includes | Timothy Arceri | 2019-11-20 | 3 | -1/+36
    This will allow us to continue searching the current path for relative shader includes.
    From the ARB_shading_language_include spec:
        "If it is quoted with double quotes in a previously included string, then the first search point will be the tree location where the previously included string had been found."
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: delay compilation skip if shader contains an include | Timothy Arceri | 2019-11-20 | 1 | -6/+40
    If the shader contains an include, we need to first run the preprocessor before deciding if we can skip compilation based on the shader cache.
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: add can_skip_compile() helper | Timothy Arceri | 2019-11-20 | 1 | -10/+20
    We will reuse this in the following commit.
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: error if #include used while extension is disabled | Timothy Arceri | 2019-11-20 | 2 | -0/+15
    In other words make sure the shader does this:
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: add preprocessor #include support | Timothy Arceri | 2019-11-20 | 7 | -4/+194
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: pass gl_context to glcpp_parser_create() | Timothy Arceri | 2019-11-20 | 3 | -7/+7
    This is a small tidy up and will be useful in the following commit.
    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: add ARB_shading_language_include support to #line | Timothy Arceri | 2019-11-20 | 7 | -8/+80
    From the ARB_shading_language_include spec:

        "#line must have, after macro substitution, one of the following forms:

            #line <line>
            #line <line> <source-string-number>
            #line <line> "<path>"

        where <line> and <source-string-number> are constant integer expressions and <path> is a valid string for a path supplied in the #include directive. After processing this directive (including its new-line), the implementation will behave as if it is compiling at line number <line> and source string number <source-string-number> or <path> path. Subsequent source strings will be numbered sequentially, until another #line directive overrides that numbering."

    Reviewed-by: Witold Baryluk <[email protected]>
* glsl: add infrastructure for ARB_shading_language_include | Timothy Arceri | 2019-11-20 | 2 | -0/+3
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Witold Baryluk <[email protected]>
* nir: don't use GLenum16 in nir.h | Marek Olšák | 2019-11-19 | 1 | -2/+1
    Reviewed-by: Connor Abbott <[email protected]>
* nir: move data.descriptor_set above data.index for better packing | Marek Olšák | 2019-11-19 | 1 | -4/+4
    4 bytes down
    Reviewed-by: Connor Abbott <[email protected]>
* glsl_to_nir: rename image_access to mem_access | Marek Olšák | 2019-11-19 | 1 | -12/+12
    Reviewed-by: Connor Abbott <[email protected]>