path: root/src/mesa/drivers/dri/i965
Commit message | Author | Age | Files | Lines
* i965: Move SOL binding #defines to brw_compiler.hJason Ekstrand2017-03-015-30/+33
| | | | | | | | While we're at it, we also change the GEN6 binding macro to be a start index that gets added to the binding. This makes things a bit more explicit. Reviewed-by: Kenneth Graunke <[email protected]>
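The change from a function-like binding macro to a start index can be sketched roughly as follows. This is only an illustration: every `EXAMPLE_`/`example_` name and every value below is hypothetical and is not taken from brw_compiler.h.

```c
#include <assert.h>

/* Hypothetical names and values, for illustration only. */
#define EXAMPLE_MAX_SOL_BINDINGS 64

/* Old style: a function-like macro hides how the index is formed. */
#define EXAMPLE_SOL_BINDING_OLD(t) (t)

/* New style: a plain start index that callers add to the binding. */
#define EXAMPLE_GEN6_SOL_BINDING_START 0

/* The addition is now visible at the point of use. */
static inline unsigned
example_sol_surface_index(unsigned binding)
{
   assert(binding < EXAMPLE_MAX_SOL_BINDINGS);
   return EXAMPLE_GEN6_SOL_BINDING_START + binding;
}
```

With a start index, a reader can see at each call site that the surface index is "start plus binding", which is presumably what the commit means by "more explicit".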
* i965/gs: Move MAX_GS_INPUT_VERTICES to brw_vec4_gs_visitor.hJason Ekstrand2017-03-012-2/+2
| | | | | | Its only users are in brw_vec4_gs_visitor and gen6_vec4_gs_visitor. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: Add the gl_prim_to_hw_prim table to vec4_gs_visitor.cppJason Ekstrand2017-03-011-1/+19
| | | | | | | | | It's currently in brw_util.c but that's the only bit of brw_util.c that's shared between the compiler and the rest of the GL driver. It's just a fairly obvious table so the duplication isn't bad. It's certainly less pain than trying to figure out how to share the code. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't use MAX_SURFACES in mark_surface_usedJason Ekstrand2017-03-011-1/+4
| | | | | | | Vulkan doesn't respect MAX_SURFACES so this assert isn't valid in that case. It should, however, assert that it isn't insanely large. Reviewed-by: Kenneth Graunke <[email protected]>
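A minimal sketch of the idea, assuming a made-up sanity bound: instead of asserting against a GL-specific MAX_SURFACES, the helper only rejects indices that are implausibly large. The struct, the function name and the limit are all hypothetical; the commit does not state the actual bound.

```c
#include <assert.h>

/* Hypothetical sanity bound; the commit does not state the real limit. */
#define EXAMPLE_SANE_SURFACE_LIMIT 2000u

struct example_prog_data {
   unsigned binding_table_size_bytes;
};

/* Record that surf_index is used, growing the binding table size if needed. */
static void
example_mark_surface_used(struct example_prog_data *prog_data,
                          unsigned surf_index)
{
   /* Not tied to any API limit; only catches wildly wrong indices. */
   assert(surf_index < EXAMPLE_SANE_SURFACE_LIMIT);

   unsigned needed = (surf_index + 1) * 4; /* one 32-bit entry per surface */
   if (needed > prog_data->binding_table_size_bytes)
      prog_data->binding_table_size_bytes = needed;
}
```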
* i965: Get rid of BRW_PRIM_OFFSETJason Ekstrand2017-03-012-14/+2
| | | | | | | This is a relic of when we wired up meta to be able to use RECTLIST primitives. It's no longer needed. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vue_map: Stop using GLbitfield typesJason Ekstrand2017-03-012-9/+9
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move assign_common_binding_table_offsets to brw_programJason Ekstrand2017-03-014-93/+94
| | | | | | | | | | This isn't used by Vulkan and is specific to the way the GL driver works. There's no reason to have it in common compiler code. Also, it relies on BRW_MAX_* defines which are defined in brw_context.h Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move some gen4 WM defines to brw_compiler.hJason Ekstrand2017-03-014-42/+46
| | | | | | | | These go in wm_prog_key so they're part of the compiler interface. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move brw_disassemble_inst to brw_eu.hJason Ekstrand2017-03-012-4/+2
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move some helpers from brw_context.h to brw_shader.hJason Ekstrand2017-03-013-16/+18
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move a couple of #defines from brw_context to brw_compilerJason Ekstrand2017-03-012-18/+16
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Drop unused STATE_TEXRECT_SCALE code.Kenneth Graunke2017-03-013-27/+0
| | | | | | | | | | | | | | In the past, we used this on Gen4-5 to transform non-normalized texture coordinates (for sampler2DRect) to normalized ones. We also used it on Gen6-7.5 for sampler2DRect with GL_CLAMP. Jason dropped this code in 6c8ba59cff14a1a86273f4008ff2a8e68335ab25 in favor of using nir_lower_tex(), which just does a textureSize() call. But we were still setting up these state references for useless uniform data. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: emit MOV_INDIRECT with the source with the right register typeSamuel Iglesias Gonsálvez2017-03-011-1/+1
| | | | | | | | This was hiding bugs as it retyped the source to destination's type. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handlesSamuel Iglesias Gonsálvez2017-03-011-3/+3
| | | | | | | | | | | | | | When generating the MOV INDIRECT instruction, the source type is ignored and it is set to the destination's type. However, this is going to change in a later patch, so we need to explicitly set the proper source type. brw_vec8_grf() creates a float-typed fs_reg by default, when the ICP handle is actually unsigned. This patch fixes these cases before applying the aforementioned patch. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: fix indirect load DF uniforms on BSW/BXTSamuel Iglesias Gonsálvez2017-03-011-21/+20
| | | | | | | | | | | | | | | | | | | | | The lowered BSW/BXT indirect move instructions had incorrect source types, which luckily wasn't causing incorrect assembly to be generated due to the bug fixed in the next patch, but would have confused the remaining back-end IR infrastructure due to the mismatch between the IR source types and the emitted machine code. v2: - Improve commit log (Curro) - Fix read_size (Curro) - Fix DF uniform array detection in assign_constant_locations() when it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT. v3: - Move changes in assign_constant_locations() to other patch. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: detect different bit size accesses to uniforms to push them in proper locationsSamuel Iglesias Gonsálvez2017-03-011-16/+34
| | | | | | | | | | | | | | Previously, if we had accesses with different sizes to the same uniform, we might not push it aligned with the bigger one. This is a problem in BSW/BXT when we access an array of DF uniforms with both direct and indirect addressing because for the latter we use 32-bit MOV INDIRECT instructions. However this problem can happen with other generations and bitsizes. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
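The alignment rule described here can be illustrated with a small sketch: take the largest access size seen for a uniform and align its push location to that size. Both helper names are hypothetical, and this is not the logic of assign_constant_locations() itself.

```c
#include <assert.h>

/* Largest access size (in bytes) observed for one uniform. */
static unsigned
example_max_access_bytes(const unsigned *access_bytes, unsigned count)
{
   unsigned max = 0;
   for (unsigned i = 0; i < count; i++) {
      if (access_bytes[i] > max)
         max = access_bytes[i];
   }
   return max;
}

/* Round offset up to a power-of-two alignment. */
static unsigned
example_align_up(unsigned offset, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}
```

For a uniform read both as 32-bit (indirect) and 64-bit (direct) data, the largest access is 8 bytes, so a candidate offset of 12 would be moved up to 16.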
* i965/fs: mark last DF uniform array element as 64 bit live oneSamuel Iglesias Gonsálvez2017-03-011-0/+3
| | | | | | | | | This bug can cause us to miss the end of a contiguous area and push larger areas than the real ones. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Move intel_resolve_map.[ch] from i965_compiler_FILES to i965_FILESKenneth Graunke2017-02-271-3/+3
| | | | | | | | I have no idea why these were part of the compiler files. They're miptree related code, and the compiler doesn't appear to use them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* main/performance_query: s/GLboolean/bool/Robert Bragg2017-02-241-2/+2
| | | | | | | | | Ideally would have caught these when adding the interface but this just switches a few return types for the INTEL_performance_query backend interface to bool instead of GLboolean. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Implement INTEL_performance_query backendRobert Bragg2017-02-226-0/+729
| | | | | | | | | | | | | | | | | | | This adds a bare-bones backend for the INTEL_performance_query extension that exposes pipeline statistics. Although this could be considered redundant given that the same statistics are already available via query objects, they are a simple starting point for this extension and it's expected to be convenient for tools wanting to have a single go-to API to introspect what performance counters are available, along with names, descriptions and semantic/data types. This code is derived from Kenneth Graunke's work, temporarily removed while the frontend and backend interface were reworked. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6+: Enable ARB_transform_feedback_overflow_query.Rafael Antognolli2017-02-211-0/+1
| | | | | | | | | | | | This extension adds new query types which can be used to detect overflow of transform feedback buffers. The new query types are also accepted by conditional rendering commands. v3: - s/gen7+/gen6+/ in the relnotes (Jordan Justen) Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for xfb overflow query on conditional render.Rafael Antognolli2017-02-211-14/+53
| | | | | | | | | | | | | | | | | | | Enable the use of a transform feedback overflow query with glBeginConditionalRender. The render commands will only execute if the query is true (i.e. if there was an overflow). Use ARB_conditional_render_inverted to change this behavior. v4: - reuse MI_MATH calcs from hsw_queryob (Kenneth) - fallback to software conditional rendering when MI_MATH is not available (Kenneth) v5: - check query->Target (Kenneth) Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for xfb overflow on query buffer objects.Rafael Antognolli2017-02-212-0/+115
| | | | | | | | | | | | | | | Enable getting the results of a transform feedback overflow query with a buffer object. v4: - make hsw_overflow_result_to_gpr0 a public function, so it can be used by conditional render. (Kenneth) - fix typo grp0/gpr0 (Kenneth) - rename load_gen_written_data_to_regs to load_overflow_data_to_cs_gprs (Kenneth) Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: add plumbing for ARB_transform_feedback_overflow_query.Rafael Antognolli2017-02-212-0/+75
| | | | | | | | | | | | | When querying for transform feedback overflow on one or all of the streams, store information about number of generated and written primitives. Then check whether generated == written. v2: - use only SO_PRIM_STORAGE_NEEDED, do not fallback to CL_INVOCATION_COUNT. (Kenneth) Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
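The check described above could look roughly like this on the CPU side: a stream overflowed when the number of generated primitives differs from the number actually written. The struct, the names and the stream limit are hypothetical; the driver performs the equivalent comparison on counter snapshots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_STREAMS 4 /* hypothetical limit for this sketch */

struct example_xfb_counts {
   uint64_t generated; /* primitives that needed storage */
   uint64_t written;   /* primitives actually written to the buffers */
};

/* Return true if any of the first num_streams streams overflowed. */
static bool
example_xfb_overflowed(const struct example_xfb_counts *streams,
                       unsigned num_streams)
{
   assert(num_streams <= EXAMPLE_MAX_STREAMS);
   for (unsigned i = 0; i < num_streams; i++) {
      if (streams[i].generated != streams[i].written)
         return true;
   }
   return false;
}
```

Passing a count of 1 covers the single-stream query; passing the full stream count covers the "any stream" variant.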
* i965: Enable ARB_transform_feedback2 on Sandybridge.Kenneth Graunke2017-02-212-0/+5
| | | | | | | | | | | | | | | | | | | | | | The only feature over and above ES 3.0 is DrawTransformFeedback(). We already have to do the whole SO_NUM_PRIMS_WRITTEN counter dance in order to compute the SVBI value for ResumeTransformFeedback(), at which point our existing GetTransformFeedbackVertexCount() implementation will do the trick (though with a stall to CPU map the buffer). Someday, we could probably implement DrawTransformFeedback() more efficiently, using the "Load Internal Vertex Count" feature of 3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit. Rumor has it this allows people to use WebGL 2.0 on Sandybridge. Note that we don't need pipelined register writes like Gen7+ because we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Properly reset SVBI counters on ResumeTransformFeedback().Kenneth Graunke2017-02-213-17/+107
| | | | | | | | | | | | | | | | | | | | | | | This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused GLES 3.0 test. When resuming the transform feedback object, we need to reset the SVBI counters so we continue writing at the correct point in the buffer. Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the Streamed Vertex Buffer Index (SVBI) counters, which contain a count of vertices emitted. Unfortunately, there's no straightforward way to store the current SVBI counter values to a buffer. They're not available in a register. You can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another internal counter which 3DPRIMITIVE can use...but there's no good way to extract that either. So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex numbers. Thankfully, we can reuse most of the existing Gen7+ code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
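As a rough illustration of deriving a vertex count from a primitives-written counter: take the difference between two counter snapshots and multiply by the vertices per primitive. The enum and function below are hypothetical and ignore details the driver has to handle, such as multiple begin/pause intervals.

```c
#include <assert.h>
#include <stdint.h>

/* Vertices emitted per primitive for each output mode (hypothetical enum). */
enum example_prim {
   EXAMPLE_POINTS = 1,
   EXAMPLE_LINES = 2,
   EXAMPLE_TRIANGLES = 3,
};

/* Vertices written between two snapshots of a primitives-written counter. */
static uint64_t
example_vertices_written(uint64_t prims_at_begin, uint64_t prims_at_pause,
                         enum example_prim mode)
{
   assert(prims_at_pause >= prims_at_begin);
   return (prims_at_pause - prims_at_begin) * (uint64_t)mode;
}
```

The resulting count is the kind of value a resume hook would need in order to continue writing at the right position in the buffer.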
* i965: Save max_index in brw_transform_feedback_object.Kenneth Graunke2017-02-212-2/+10
| | | | | | | I'm going to need this in a new Resume hook shortly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Update brw_save_primitives_written_counters for pre-Gen7.Kenneth Graunke2017-02-211-4/+10
| | | | | | | Sandybridge and earlier only have a single counter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.Kenneth Graunke2017-02-211-9/+16
| | | | | | | | This way on Sandybridge we'll only do 1 stream worth of math, since we only have one SO_NUM_PRIMS_WRITTEN counter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Move some code from gen7_sol_state.c to gen6_sol.c.Kenneth Graunke2017-02-213-144/+150
| | | | | | | | | I plan to use these functions on Sandybridge soon. I changed the prefix on a couple of functions to "brw" instead of "gen7" as in theory they should be usable all the way back to G45. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.Kenneth Graunke2017-02-211-26/+24
| | | | | | | | These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG are supported, which Gen8+ can always do. So this code is dead. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/blorp: Explicitly flush all allocated stateJason Ekstrand2017-02-211-0/+8
| | | | | | | | Found by inspection. However, I expect it fixes real bugs when using blorp from Vulkan on little-core platforms. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* i965: remove 'virtual' and extern C workaroundsEmil Velikov2017-02-211-13/+3
| | | | | | | The headers are properly annotated thus we don't need these. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: add extern C notation in headersEmil Velikov2017-02-213-0/+22
| | | | | | | | | Otherwise symbols won't be annotated with C linkage and we'll fail at link time. Currently this is worked around by wrapping the header inclusion itself. The latter is in itself fragile and not recommended. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
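The usual pattern for annotating a header is sketched below; `example_driver_hook` is a hypothetical function, not one from the driver. The guard makes the declarations get C linkage when the header is included from C++, and compiles away in plain C.

```c
#include <assert.h>

/* --- what would live in the header --- */
#ifdef __cplusplus
extern "C" {
#endif

int example_driver_hook(int value);

#ifdef __cplusplus
}
#endif

/* --- what would live in a .c file --- */
int
example_driver_hook(int value)
{
   assert(value >= 0);
   return value + 1;
}
```

With the guard inside the header, C++ sources can include it directly instead of wrapping the `#include` in their own `extern "C"` block.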
* i965/fs: fix uninitialized memory accessLionel Landwerlin2017-02-171-3/+2
| | | | | | | | Found while running shader-db under valgrind. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Cc: "13.0 17.0" <[email protected]>
* i965/fs: fix 32-bit data type to int64 conversion on BSW/BXTSamuel Iglesias Gonsálvez2017-02-171-7/+7
| | | | | | | | | | | The 32-bit to 64-bit conversions need to have the 32-bit data source elements aligned to 64-bit but only with doubles as destination type. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Tested-by: Mark Janes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Remove hand-coded 64-bit packing optimizationsJason Ekstrand2017-02-161-50/+0
| | | | | | | | The optimization in unpack_64 is clearly subsumed by the opt_algebraic optimizations in the previous commit. The pack optimization may not be quite handled by opt_algebraic but opt_algebraic should get the really bad cases. Also, it's been broken since it was merged and we've never noticed so it must not be doing anything. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Rename lower_double_pack to lower_64bit_packJason Ekstrand2017-02-161-1/+1
| | | | | | | There's nothing "double" about it other than, perhaps, the fact that it packs two 32-bit values. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Combine the int and double [un]pack opcodesJason Ekstrand2017-02-162-26/+13
| | | | | | | NIR is a typeless IR and the two opcodes, when considered bitwise, do exactly the same thing. There's no reason to have two versions. Reviewed-by: Kenneth Graunke <[email protected]>
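To see why one opcode suffices, consider a bitwise sketch: packing two 32-bit halves yields the same 64 bits whether the result is later read as an integer or as a double. The helper names are hypothetical, and the double interpretation assumes an IEEE 754 binary64 `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack two 32-bit values into one 64-bit value, low half first. */
static uint64_t
example_pack_64_2x32(uint32_t lo, uint32_t hi)
{
   return ((uint64_t)hi << 32) | lo;
}

/* Reinterpret the same 64 bits as a double without changing them. */
static double
example_bits_as_double(uint64_t bits)
{
   double d;
   assert(sizeof(d) == sizeof(bits));
   memcpy(&d, &bits, sizeof(d));
   return d;
}
```

The pack step never looks at the type; only the consumer decides how to interpret the bits.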
* i965/fs: Fix the inline nir_op_pack_double optimizationJason Ekstrand2017-02-161-1/+1
| | | | | | | We can only do the optimization if the source *is* SSA. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* i965: Do not use purged bo after calling glObjectUnpurgeableChris Wilson2017-02-151-9/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | If the buffer has been freed by the kernel under memory pressure, it is invalid to try and access the backing storage for that buffer in the future - the backing storage is not recreated automatically. As such we need to mark the GL object as being freed for unretained buffers and so recreate the object on next use. Furthermore, from the GL_APPLE_object_purgeable spec: "In contrast, by calling ObjectUnpurgeableAPPLE with an <option> of UNDEFINED_APPLE, the application is indicating that it intends to recreate the contents of the storage from scratch. Further, the application is stating that it would like the GL to do only the minimal amount of work required to set PURGEABLE_APPLE to FALSE. If ObjectUnpurgeableAPPLE is called with the <option> set to UNDEFINED_APPLE, then ObjectUnpurgeableAPPLE will return the value UNDEFINED_APPLE." we must always report GL_UNDEFINED_APPLE when called with glObjectUnpurgeable(GL_UNDEFINED_APPLE). Testcase: piglit/object_purgeable-api-* Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
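The two rules in this message can be sketched as a small decision function. Everything here is hypothetical (the enum stands in for the GL_RETAINED_APPLE and GL_UNDEFINED_APPLE results, and the struct is not the driver's buffer object), and it is an assumption of this sketch that only a kernel-purged buffer is flagged for recreation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the GL_RETAINED_APPLE / GL_UNDEFINED_APPLE results. */
enum example_option {
   EXAMPLE_RETAINED,
   EXAMPLE_UNDEFINED,
};

struct example_buffer {
   bool purged_by_kernel; /* backing storage was freed under pressure */
   bool needs_realloc;    /* recreate storage on next use */
};

static enum example_option
example_object_unpurgeable(struct example_buffer *buf,
                           enum example_option option)
{
   assert(buf != NULL);

   /* A purged buffer must never be touched again: flag it for recreation. */
   if (buf->purged_by_kernel)
      buf->needs_realloc = true;

   /* Always report UNDEFINED when the caller asked with UNDEFINED. */
   if (option == EXAMPLE_UNDEFINED || buf->purged_by_kernel)
      return EXAMPLE_UNDEFINED;

   return EXAMPLE_RETAINED;
}
```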
* i965: define default allow_higher_compat_version valueLionel Landwerlin2017-02-151-0/+1
| | | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Matt Turner <[email protected]> Fixes: 9d16f3903e2 ("driconf: add allow_higher_compat_version option")
* driconf: add allow_higher_compat_version optionSamuel Pitoiset2017-02-151-0/+3
| | | | | | | | | | | | | | | | | | | Mesa currently doesn't allow creating 3.1+ compatibility profiles mainly because various features are unimplemented and bugs can happen. However, some buggy apps request a compat profile without using any old features unimplemented in mesa, and they fail to start. This option should help some games to run but it's not enough for all (eg. Dying Light). v2: - s/force_compat_profile/allow_higher_compat_version Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* i965/sampler_state: Set the "Base Mip Level" field on Sandy BridgeJason Ekstrand2017-02-122-1/+20
| | | | | | | | | | | Fixes two GL ES 3.0 CTS tests on Sandy Bridge: ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "17.0 13.0" <[email protected]>
* i965/sampler_state: Pass texObj into update_sampler_stateJason Ekstrand2017-02-121-6/+4
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "17.0 13.0" <[email protected]>
* i965/sampler_state: Clamp min/max LOD to 14 on gen7+Jason Ekstrand2017-02-121-2/+5
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "17.0" <[email protected]>
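A minimal sketch of such a clamp, assuming the upper bound of 14 from the commit title and a lower bound of 0 (the lower bound is an assumption of this sketch). The macro and function names are hypothetical.

```c
#include <assert.h>

#define EXAMPLE_MAX_LOD 14.0f /* upper bound from the commit title */

/* Clamp a min/max LOD value into the range [0, 14]. */
static float
example_clamp_lod(float lod)
{
   assert(lod == lod); /* reject NaN */
   if (lod < 0.0f)
      return 0.0f;
   if (lod > EXAMPLE_MAX_LOD)
      return EXAMPLE_MAX_LOD;
   return lod;
}
```

A value such as GL's default maximum LOD of 1000 would be reduced to 14 before being packed into the sampler state.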
* i965/fs: add support for int64 to bool conversionSamuel Iglesias Gonsálvez2017-02-091-2/+13
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Add support for nir_op_[iu]2[iu]32Samuel Iglesias Gonsálvez2017-02-091-0/+4
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Add support for nir_op_[iu]642fSamuel Iglesias Gonsálvez2017-02-091-0/+2
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: legalize [u]int64 to 32-bit data conversions in lower_d2xSamuel Iglesias Gonsálvez2017-02-091-1/+3
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>