summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* nir: rename nir_lower_samplers.c{pp,}Emil Velikov2015-09-212-5/+3
| | | | | | | | | | | | | | | | | | With the only C++ function having its own wrapper we can 'demote' this file to a normal C one. This allows us to get rid of extern C { #include <foo.h> } 'hacks'. Plus some of the headers may use C99 initializers, which are not supported by the ISO standard. This may cause build issue on incremental builds. If so run the following: sed -i -e 's|samplers\.cpp|samplers.c|' src/glsl/nir/.deps/nir_lower_samplers.Plo Fixes: ef8eebc6ad5(nir: support indirect indexing samplers in struct arrays) Signed-off-by: Emil Velikov <[email protected]> Reported-by: Gottfried Haider <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: add C wrapper around glsl_type::record_location_offsetEmil Velikov2015-09-212-0/+9
| | | | | | | | This will allow us to convert nir_lower_sampler.cpp to C. Signed-off-by: Emil Velikov <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: move stdio.h inclusion before extern CEmil Velikov2015-09-211-2/+2
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Fix MRF register number assertions for compr4.Kenneth Graunke2015-09-211-2/+2
| | | | | | | | | | compr4 is represented by setting the high bit on the MRF number. We need to mask it out before sanity checking the register number. Fixes ~8000 assert fails on Ironlake and G45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066 Signed-off-by: Kenneth Graunke <[email protected]>
* radeonsi: implement TXQS supportIlia Mirkin2015-09-212-25/+69
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Fredrik Bruhn <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: load fmask ptr relative to the resources arrayIlia Mirkin2015-09-211-1/+1
| | | | | | | | | | | res_ptr already contains the resource values. fmask_ptr needs to be looked up relative to the start of the resource params. Note that this only affects indirect loads of MS sampler arrays. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0" <[email protected]>
* i965/vec4: Use MRF registers 21-23 for spilling in gen6Iago Toral Quiroga2015-09-211-4/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use MRF registers 21-23 for spilling in gen6Iago Toral Quiroga2015-09-211-4/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generationIago Toral Quiroga2015-09-218-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some bug reports about shaders failing to compile in gen6 because MRF 14 is used when we need to spill. For example: https://bugs.freedesktop.org/show_bug.cgi?id=86469 https://bugs.freedesktop.org/show_bug.cgi?id=90631 Discussion in bugzilla pointed to the fact that gen6 might actually have 24 MRF registers available instead of 16, so we could use other MRF registers and avoid these conflicts (we still need to investigate why some shaders need up to MRF 14 anyway, since this is not expected). Notice that the hardware docs are not clear about this fact: SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device Hardware" says "Number per Thread" - "24 registers" However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says: "Normal threads should construct their messages in m1..m15. (...) Regardless of actual hardware implementation, the thread should not assume th at MRF addresses above m15 wrap to legal MRF registers." Therefore experimentation was necessary to evaluate if we had these extra MRF registers available or not. This was tested in gen6 using MRF registers 21..23 for spilling and doing a full piglit run (all.py) forcing spilling of everything on the FS backend. It was also tested by doing spilling of everything on both the FS and the VS backends with a piglit run of shader.py. In both cases no regressions were observed. In fact, many of these tests where helped in the cases where we forced spilling, since that triggered the same underlying problem described in the bug reports. Here are some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on gen6 hardware: Using MRFs 13..15 for spilling: crash: 2, fail: 113, pass: 6621, skip: 5461 Using MRFs 21..23 for spilling: crash: 2, fail: 12, pass: 6722, skip: 5461 This patch sets the ground for later patches to implement spilling using MRF registers 21..23 in gen6. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move MRF register asserts out of brw_reg.hIago Toral Quiroga2015-09-214-7/+16
| | | | | | | | | | | | | | | In a later patch we will make BRW_MAX_MRF return a different value depending on the hardware generation, but it is inconvenient to add a gen parameter to the brw_reg functions only for the assertions, so move these to places where we have the hardware generation available. Ken suggested to add the asserts to brw_set_src0 and brw_set_dest since that would make sure that we catch all uses of MRF registers, even those coming from modules that generate native code directly, like blorp. Unfortunately, this is very late in the process which can make things harder to debug, so add asserts to the generator as well. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Maximum allowed size of SEND messages is 15 (4 bits)Iago Toral Quiroga2015-09-214-2/+10
| | | | | | | | | | | | | | | | | | | Until now we only used MRFs 1..15 for regular SEND messages, so the message length could not possibly exceed the maximum size. Soon we'll allow to use MRF registers 1..23 in gen6, so we need to be careful not to build messages that can go beyond the limit. That could occur, specifically, when building URB write messages, which we may need to split in chunks due to their size. Previously we would simply go and create a new message when we reached MRF 13 (since 13..15 were reserved for spilling), now we also want to check the size of the message explicitly. Besides adding that condition to split URB write messages properly, this patch also adds asserts in the generator. Notice that brw_inst_set_mlen already asserts for this, but asserting in the generators is easy and can make debugging easier in some cases. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/print: fix coverity errorRob Clark2015-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Not something actually hit in real life (now state is never non-null, but only case state->syms is null is if nir_print_instr() path). But it was something I overlooked the first time, so might as well fix it. *** CID 1324642: Null pointer dereferences (REVERSE_INULL) /src/glsl/nir/nir_print.c: 299 in print_var_decl() 293 294 fprintf(fp, " (%s, %u)", loc, var->data.driver_location); 295 } 296 297 fprintf(fp, "\n"); 298 >>> CID 1324642: Null pointer dereferences (REVERSE_INULL) >>> Null-checking "state" suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 299 if (state) { 300 _mesa_set_add(state->syms, name); 301 _mesa_hash_table_insert(state->ht, var, name); 302 } 303 } 304 Signed-off-by: Rob Clark <[email protected]>
* i965/vec4/nir: Remove all "this->" snippetsEduardo Lima Mitev2015-09-201-16/+15
| | | | | | | | For consistency, either we have all class members dereferenced, or none. In this case, very few are so lets get rid of them all. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* dri/common: fix gbm-symbols-check regressionMarcin Ślusarz2015-09-201-1/+1
| | | | | | | | Broken by commit c228514c72cb2fd5fb9e510808e29204fc9e7ae1 "dri/common: use sysconfdir when looking for drirc". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92054 Signed-off-by: Marcin Ślusarz <[email protected]>
* mesa/teximage: reuse compressed format utility functions for base_formatNanley Chery2015-09-191-145/+5
| | | | | | | | | | | | | | | Reuse utility functions instead of reimplementing the same logic. * _mesa_is_compressed_format() performs the required checking to determine format support in the current context. * _mesa_gl_compressed_format_base_format() returns the base format. As a side effect, we now check that we're in a desktop context when determining support for the FXT1 and RGTC formats. This is in agreement with our extension table and the glext headers. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/texcompress: add compressed formats to base format utility functionNanley Chery2015-09-191-0/+14
| | | | | | | Add S3TC and PALETTE formats. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/glformats: refactor compressed format support functionNanley Chery2015-09-191-79/+40
| | | | | | | | | | | Instead of case statements, use _mesa_get_format_layout() to determine if a GL format is part of a family of compressed formats. v2. restrict LATC formats to API_OPENGL_COMPAT (Ilia). rename the variable mFormat to m_format. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/formats: add MESA_LAYOUT_LATCNanley Chery2015-09-195-16/+7
| | | | | | | | | This enables us to predicate statments on a compressed format being a type of LATC format. Also, remove the comment that lists the enum (it was getting a tad long). Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* dri/common: use sysconfdir when looking for drircMarcin Ślusarz2015-09-192-1/+6
| | | | | | | | Useful when locally installed mesa has more quirks than the system one. Signed-off-by: Marcin Ślusarz <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* freedreno/ir3: use nir two-sided-color loweringRob Clark2015-09-181-21/+3
| | | | | | | | With this, we completely switch over to nir lowering passes instead of tgsi_lowering. So one step closer to supporting direct glsl or spirv to nir support for freedreno a3xx/a4xx. Signed-off-by: Rob Clark <[email protected]>
* nir: add two-sided-color lowering passRob Clark2015-09-183-0/+211
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/build: add nir_vec() helperRob Clark2015-09-183-31/+20
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* freedreno/ir3: lower txp/clamp in NIRRob Clark2015-09-181-26/+30
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir/lower_tex: add support to clamp texture coordsRob Clark2015-09-182-1/+103
| | | | | | | | | | | | | | Some hardware needs to clamp texture coordinates to [0.0, 1.0] in the shader to emulate GL_CLAMP. This is added to lower_tex_proj since, in the case of projected coords, the clamping needs to happen *after* projection. v2: comments/suggestions from Ilia and Eric, use txs to get texture size and clamp RECT textures to their dimensions rather than [0.0, 1.0] to avoid having to lower RECT textures to 2D. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_tex: support for lowering RECT texturesRob Clark2015-09-182-3/+63
| | | | | | | | v2: comments/suggestions from Ilia and Eric, split out get_texture_size() helper so we can use it in the next commit for clamping RECT textures. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_tex: support projector lowering per sampler typeRob Clark2015-09-183-10/+34
| | | | | | | | | | | | Some hardware, such as adreno a3xx, supports txp on some but not all sampler types. In this case we want more fine grained control over which texture projectors get lowered. v2: split out nir_lower_tex_options struct to make it easier to add the additional parameters coming in the following patches Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_tex: split out project_src() helperRob Clark2015-09-181-69/+77
| | | | | | | Split this out to reduce noise in later patches. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename nir_lower_tex_projectorRob Clark2015-09-184-8/+8
| | | | | | | | | | Since the following patches will add additional tex-lowering related functionality, which doesn't make sense to split out into a separate pass (as they would require duplication of the projector lowering logic), let's give this pass a more generic name. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Change types as needed to propagate source modifiers using ↵Alejandro Piñeiro2015-09-191-2/+28
| | | | | | | | | | | | | | | | | | | | | | current instruction SEL and MOV instructions, as long as they don't have source modifiers, are just copying bits around. So those kind of instruction could be propagated even if there are type mismatches. This is needed because NIR generates integer SEL and MOV instructions whenever it doesn't know what else to generate. This commit adds support for copy propagation using current instruction as reference. Equivalent to commit 472ef9 but for vec4. v2: include check for saturate, as Jason Ekstrand suggested v3: check that the dst.type and the src type are the same, in order to solve (among others) the following deqp regression with v2: dEQP-GLES3.functional.shaders.operator.unary_operator.minus.lowp_uint_vertex Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Fix comparison between signed and unsigned integer expressionsIago Toral Quiroga2015-09-181-2/+2
| | | | | | | brw_fs_visitor.cpp: In member function 'void fs_visitor::emit_urb_writes()': brw_fs_visitor.cpp:977:58: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] Reviewed-by: Tapani Pälli <[email protected]>
* mesa: fix errors when reading depth with glReadPixelsTapani Pälli2015-09-182-1/+7
| | | | | | | | | | | | | | | | | | | | | | | OpenGL ES 3.0 spec 3.7.2 "Transfer of Pixel Rectangles" specifies DEPTH_COMPONENT, UNSIGNED_INT as a valid couple, validation for internal format is checked by is_float_depth(). Fix regression caused by 81d2fd91a90e5b2fd9fd74792a7a7c329f0e4d29 in: ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels Test uses GL_DEPTH_COMPONENT, UNSIGNED_INT only when GL_NV_read_depth extension is present. v2: change check in _mesa_error_check_format_and_type to be explicit for ES 2.0+, desktop OpenGL does not allow this behaviour + uses this function for both glReadPixels and glDrawPixels validation. (No Piglit regressions seen with v2.) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> [v1] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92009 Cc: "10.6 11.0" <[email protected]>
* nir/builder: fix c++11 compiler warningRob Clark2015-09-171-1/+1
| | | | | | | | | | | Fixes: In file included from nir/nir_lower_samplers.cpp:27:0: nir/nir_builder.h: In function 'nir_ssa_def* nir_channel(nir_builder*, nir_ssa_def*, int)': nir/nir_builder.h:222:37: warning: narrowing conversion of 'c' from 'int' to 'unsigned int' inside { } is ill-formed in C++11 [-Wnarrowing] unsigned swizzle[4] = {c, c, c, c}; Signed-off-by: Rob Clark <[email protected]>
* nir: really actually fix comment this timeRob Clark2015-09-171-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir/print: print variable namesRob Clark2015-09-171-0/+30
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: some comment fixupsRob Clark2015-09-171-5/+5
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* freedreno/ir3: add --gpu arg to cmdline compilerRob Clark2015-09-171-1/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: wire up ucp supportRob Clark2015-09-171-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for ucpRob Clark2015-09-174-13/+80
| | | | | | | | | Use nir_lower_clip pass for adding the VS/FS instructions to handle user-clip-planes and CLIPDIST. Wire up support for load_user_clip_plane intrinsic to fetch ucp[plane] values as driver-params (passed as const's to the shader). Signed-off-by: Rob Clark <[email protected]>
* nir: add lowering stage for user-clip-planes / clipdistRob Clark2015-09-173-0/+344
| | | | | | | | | | | | | | | The vertex shader lowering adds calculation for CLIPDIST, if needed (ie. user-clip-planes), and the frag shader lowering adds conditional kills based on CLIPDIST value (which should be treated as a normal interpolated varying by the driver). Note that this won't quite do the right thing in the face of MSAA plus user-clip-planes, since all the samples would be killed or not (rather than potentially only a portion of them). But it's better than no UCP support at all for drivers that don't have this in hw. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: add sysval for user-clip-planesRob Clark2015-09-171-13/+14
| | | | | | | | For lowering user-clip-planes, we need a way to pass the enabled/used user-clip-planes in to shader. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/ir3: convert from tgsi semantic/index to varying-slotRob Clark2015-09-177-193/+234
| | | | Signed-off-by: Rob Clark <[email protected]>
* glsl: add SYSTEM_VALUE_VERTEX_CNTRob Clark2015-09-172-0/+7
| | | | | | | | | Used internally in freedreno/ir3 to calc stream-out position. Seems like a generic enough way to implement stream-out (using str instrs), plus it avoids compiler warnings by sneaking in a non-enum value in switch statements. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: switch to shader_enums.h interp constantsRob Clark2015-09-174-41/+20
| | | | | | A small step towards un-TGSI'ifying ir3. Signed-off-by: Rob Clark <[email protected]>
* nv50,nvc0: flush texture cache in presence of coherent bufsIlia Mirkin2015-09-172-0/+39
| | | | | | | | This fixes the newly-added arb_texture_buffer_object-bufferstorage piglit test. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
* nv50,nvc0: detect underlying resource changes and update ticIlia Mirkin2015-09-172-0/+43
| | | | | | | | | | | | | When updating texture buffers, we might end up replacing the whole buffer. Check that the tic address matches the resource address, and if not, update the tic and reupload it. This fixes: arb_direct_state_access-texture-buffer arb_texture_buffer_object-data-sync Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
* vc4: Try to pair up instructions when only one of them has PM bitBoyan Ding2015-09-171-47/+76
| | | | | | | | | | | | | Instructions with difference in PM field can actually be paired up if the one without PM doesn't do packing/unpacking and non-NOP packing/unpacking operations from PM instruction aren't added to the other without PM. total instructions in shared programs: 48209 -> 47460 (-1.55%) instructions in affected programs: 11688 -> 10939 (-6.41%) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vec4: Use nir_move_vec_src_uses_to_destJason Ekstrand2015-09-171-0/+3
| | | | | | | | | | | | | | | | | | | The idea here is not that it gives register coalescing a little bit of a helping hand. It doesn't actually fix the coalescing problems, but it seems to help a good bit. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1746280 -> 1683959 (-3.57%) instructions in affected programs: 1259166 -> 1196845 (-4.95%) helped: 11363 HURT: 148 v2 (Jason Ekstrand): - Run nir_move_vec_src_uses_to_dest after going out of SSA - New shader-db numbers Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir: Add a pass to rewrite uses of vecN sources to the vecN destinationJason Ekstrand2015-09-173-0/+199
| | | | | | | v2 (Jason Ekstrand): - Handle non-SSA sources and destinations Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir: Add comments to nir_index_instrs and nir_index_ssa_defsJason Ekstrand2015-09-171-0/+8
| | | | | | | The provided indices have the very nice property that if A dominates B then A->index <= B->index. We should document that somewhere. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a generic instruction indexJason Ekstrand2015-09-172-0/+22
| | | | Reviewed-by: Kenneth Graunke <[email protected]>