aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri
Commit message (Collapse)AuthorAgeFilesLines
...
* i965/fs: Don't constant propagate into integer math instructions.Matt Turner2015-04-242-3/+5
| | | | | | Constant combining won't promote non-floats, so this isn't safe. Fixes regressions since commit 0087cf23e.
* i965/fs: Allow 2-src math instructions to have immediate src1.Matt Turner2015-04-242-7/+11
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add an INTEL_DEBUG=spill option to test spillingJason Ekstrand2015-04-233-1/+3
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/debug: Use the ull specifier for DEBUG enum definesJason Ekstrand2015-04-231-31/+31
| | | | | | | The INTEL_DEBUG variable is a uint64_t and if we want a enum value higer than 32 bits, you need to use ull. We might as well use it for all of them. Reviewed-by: Matt Turner <[email protected]>
* i965: Disallow linear blits that are not cacheline aligned.Kenneth Graunke2015-04-231-8/+19
| | | | | | | | | | | | | | | The BLT engine on Gen8+ requires linear surfaces to be cacheline aligned. This restriction was added as part of converting the BLT to use 48-bit addressing. The main user, intel_emit_linear_blit, now handles this properly. But we might also have linear miptrees; just refuse to blit those. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: [email protected]
* i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.Kenneth Graunke2015-04-231-8/+14
| | | | | | | | | | | | | | | | | | | | The BLT engine on Gen8+ requires linear surfaces to be cacheline aligned. This restriction was added as part of converting the BLT to use 48-bit addressing. intel_emit_linear_blit needs to handle blits that are not cacheline aligned, as we use it for arbitrary glBufferSubData calls and subrange mappings. Since intel_emit_linear_blit uses 1 byte per pixel, we can use the src/dst pixel X offset field to represent the unaligned portion, and subtract that from the address so it's cacheline aligned. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: [email protected]
* i965/nir: Use the correct offsets when handling register indirectsJason Ekstrand2015-04-221-27/+27
| | | | Reviewed-by: Connor Abbott <[email protected]>
* i965: Add a brw_compiler structure and store the register sets in itJason Ekstrand2015-04-227-97/+120
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Rename brw_compile to brw_codegenJason Ekstrand2015-04-2229-246/+246
| | | | | | | | | | | | This name better matches what it's actually used for. The patch was generated with the following command: for file in *; do sed -i -e s/brw_compile/brw_codegen/g $file done Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use device_info instead of the context for computing vue mapsJason Ekstrand2015-04-224-7/+12
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Use device_info instead of the context in instruction schedulingJason Ekstrand2015-04-223-14/+13
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Add a devinfo field to backend_visitor and use it for gen checksJason Ekstrand2015-04-2219-221/+225
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Remove remaining uses of ctx->Const.UniformBooleanTrue in visitorsJason Ekstrand2015-04-222-12/+6
| | | | | | | | | | Since commit 2881b123, we have used 0/~0 for representing booleans on all gens. However, we still had a bunch of places in the visitor code where we were still referring to ctx->Const.UniformBooleanTrue. Since this is always ~0, we can just remove them. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Add a devinfo field to the generator and use it for gen checksJason Ekstrand2015-04-222-46/+42
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Add a devinfo field to the generator and use it for gen checksJason Ekstrand2015-04-222-59/+58
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/device_info: Add a supports_simd16_3src flagJason Ekstrand2015-04-226-55/+56
| | | | | | | | | | This also involves moving revision checking to screen creation time and passing that into brw_get_device_info so that we can get the right device_info for early versions of SKL. Since the only place we used revision was to check for SIMD16 3-src instruction support, it's safe to remove the revision field from brw_context. Reviewed-by: Matt Turner <[email protected]>
* i965/device_info: Add a HSW_FEATURES macroJason Ekstrand2015-04-221-3/+7
| | | | | | It's basically just a copy of GEN7_FEATURES only with is_haswell set Reviewed-by: Matt Turner <[email protected]>
* i965: Make the annotation code take a device_info instead of a contextJason Ekstrand2015-04-224-10/+14
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Remove the GL context from the generatorJason Ekstrand2015-04-222-11/+1
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Remove the context field from brw_compilerJason Ekstrand2015-04-2215-63/+42
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Make the disassembler take a device_info instead of a contextJason Ekstrand2015-04-2211-109/+99
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Make instruction compaction take a device_info instead of a contextJason Ekstrand2015-04-224-109/+112
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Make the brw_inst helpers take a device_info instead of a contextJason Ekstrand2015-04-2216-995/+1006
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Add a devinfo parameter to brw_compileJason Ekstrand2015-04-222-0/+2
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Do better fake context setup in unit testsJason Ekstrand2015-04-225-5/+24
| | | | | | | In future tests, we will start relying on devinfo and not just brw in the compiler. Changing this now keeps these tests from failing in the future. Reviewed-by: Matt Turner <[email protected]>
* i965: Remove the context parameter from brw_texture_offsetJason Ekstrand2015-04-225-12/+5
| | | | | | | | It wasn't really being used anyway. We used it to assert that gpu_shader5 is supported in the back-end but that should be caught by the front-end. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* android: xmlpool: cleanup the generation rulesEmil Velikov2015-04-221-3/+2
| | | | | | | | | - Do not attempt to create the save folder twice - both dir $@ and PRIVATE_LOCALEDIR point to the same place. - Use @ and $(hide), for mkdir and python, to avoid spamming the output. Signed-off-by: Emil Velikov <[email protected]>
* android: xmlpool: Get rid of the last use of intermediates-dir-forChih-Wei Huang2015-04-222-10/+7
| | | | | | | | | v2 [Emil Velikov] - Keep the PRIVATE_LOCALEDIR variable. - Do not use $(@D) but the more widespead $(dir $@) Signed-off-by: Chih-Wei Huang <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* android: export the path of the generated headersChih-Wei Huang2015-04-222-1/+2
| | | | | | | The modules need the headers can get the path automatically. Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* android: fix the building rules for Android 5.0Chih-Wei Huang2015-04-223-2/+9
| | | | | | | | | | | | | | | | | Android 5.0 allows modules to generate source into $OUT/gen, which will then be copied into $OUT/obj and $OUT/obj_$(TARGET_2ND_ARCH) as necessary. Modules will need to change calls to local-intermediates-dir into local-generated-sources-dir. The patch changes local-intermediates-dir into local-generated-sources-dir. If the Android version is less than 5.0, fallback to local-intermediates-dir. The patch also fixes the 64-bit building issue of Android 5.0. v2 [Emil Velikov] - Keep the LOCAL_UNSTRIPPED_PATH variable. Signed-off-by: Chih-Wei Huang <[email protected]>
* android: dri: link against libmesa_utilEmil Velikov2015-04-221-1/+2
| | | | | | | | The dri modules depend on symbols provided by it. Cc: "10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Chih-Wei Huang <[email protected]>
* android: add gallium dirs to more places in the treeEmil Velikov2015-04-221-0/+2
| | | | | | | | Similar to e8c5cbfd921(mesa: Add gallium include dirs to more parts of the tree.) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Chih-Wei Huang <[email protected]>
* android: dri/common: conditionally include drm_cflags/set __NOT_HAVE_DRM_HEmil Velikov2015-04-221-0/+14
| | | | | | | Otherwise we'll fail to find the drm.h header. Cc: "10.4 10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* android: add $(mesa_top)/src include to the whole of mesaEmil Velikov2015-04-221-1/+0
| | | | | | | | | Many parts of mesa already have the include with others depending on it but it's missing. Add it once at the top makefile and be done with it. Cc: "10.4 10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Chih-Wei Huang <[email protected]>
* android: use LOCAL_SHARED_LIBRARIES over TARGET_OUT_HEADERSEmil Velikov2015-04-221-1/+0
| | | | | | | | | ... to manage the LIBDRM*_CFLAGS. The former is the recommended approach by the Android build system developers while the latter has been depreciated for quite some time. Cc: "10.4 10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* drirc: Add "Second Life" quirk (allow_glsl_extension_directive_midshader).Kenneth Graunke2015-04-211-0/+4
| | | | | | | | | | | | Appears to fix shader compilation. Tested by starting the client, dragging the "quality and speed" slider back and forth, and watching the console output - instead of piles of "shader failed to compile", the CPU seems to be busy compiling shaders. I haven't actually tried to play. Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69226 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71591 Cc: [email protected]
* i965/fs: Combine pixel center calculation into one inst.Matt Turner2015-04-213-20/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can calculate them together with a single add(32) instruction on some platforms like Broadwell and newer or in SIMD8 elsewhere. Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before LINTERP to after it. That's because the writes_accumulator_implicitly() function in backend_instruction tests for <= LINTERP for determining whether the instruction indeed writes the accumulator implicitly. The old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new opcodes just emit MOVs, which don't. It doesn't matter, since we don't use these opcodes on Gen4/5 anymore, but in the case that we do... On Broadwell: total instructions in shared programs: 7192355 -> 7186224 (-0.09%) instructions in affected programs: 1190700 -> 1184569 (-0.51%) helped: 6131 On Haswell: total instructions in shared programs: 6155979 -> 6152800 (-0.05%) instructions in affected programs: 652362 -> 649183 (-0.49%) helped: 3179 Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Calculate delta_x and delta_y together.Matt Turner2015-04-217-74/+79
| | | | | | | | | | | | | This lets SIMD16 programs on G45 and Gen5 use the PLN instruction. On Ironlake: total instructions in shared programs: 5634757 -> 5518055 (-2.07%) instructions in affected programs: 1745837 -> 1629135 (-6.68%) helped: 11439 HURT: 4 Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Emit ADDs for gl_FragCoord, not virtual opcodes.Matt Turner2015-04-215-51/+8
| | | | | | | | | | These were used only on Gen4 and 5. emit_interpolation_setup_gen6() emits ADDs directly. The virtual opcodes weren't providing anything useful. I'm going to repurpose these opcodes, so deleting and readding them makes it simpler to see what's going on. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Manually set source regioning on PLN instructions.Matt Turner2015-04-211-1/+13
| | | | | | | | Like LINE (commit 92346db0), src0 must have a scalar region. Setting src1's region to <8,8,1> lets us pass a properly sized combined delta_xy argument in a few commits without getting a bogus <16,16,1> region. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Add LINTERP's src0 to fs_inst::regs_read().Matt Turner2015-04-211-11/+2
| | | | | | | | | | LINTERP's src0 is PLN's src1, and PLN's src1 reads exec_size / 4 registers. Having that information lets us drop the delta_x/y special case code in split_virtual_grfs(). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set compression only if writing two registers.Matt Turner2015-04-211-1/+4
| | | | | | | We don't want to set compression control on a SIMD16 instruction operating on words or smaller. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Allow an execution size of 32.Matt Turner2015-04-212-1/+2
| | | | | | | In a few commits, we'll start emitting an add(32) instruction on some platforms. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make type_sz() return unsigned.Matt Turner2015-04-211-1/+1
| | | | | | Avoids annoying warnings when comparing with sizeof(...). Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Replace guess_execution_size with something simpler.Matt Turner2015-04-214-27/+35
| | | | | | | | | | | | | | | | | | | | | | | guess_execution_size() does two things: 1. Cope with small destination registers. 2. Cope with SIMD8 vs SIMD16 mode. This patch replaces the first with a simple if block in brw_set_dest: if the destination register width is less than 8, you probably want the execution size to match. (I didn't put this in the 3src block because it doesn't seem to matter.) Since only the FS compiler cares about SIMD16 mode, it's easy to just set the default execution size there. This pattern was already been proven in the Gen8+ generator, but we didn't port it back to the existing generator when we combined the two. This is based on a patch from Ken from about a year ago. I've rebased it and and fixed a few bugs. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Ensure delta_x/y are even-aligned registers on Gen6.Matt Turner2015-04-212-3/+3
| | | | | | The BSpec says this applies to Gen6 as well. Reviewed-by: Jason Ekstrand <[email protected]>
* radeon: replace __FUNCTION__ with __func__Marius Predut2015-04-2132-116/+116
| | | | | | | | | Consistently just use C99's __func__ everywhere. No functional changes. Signed-off-by: Marius Predut <[email protected]> Acked-by: Michel Dänzer <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965/skl: Fix the qpitch valueNeil Roberts2015-04-202-13/+59
| | | | | | | | | | | | | | | | | | On Skylake the qpitch value is uploaded as part of the surface state so we don't need to add the extra rows that are done for other generations. However for 3D textures it needs to be aligned to the tile height and for depth/stencil textures it needs to be a multiple of 8. Unlike previous generations the qpitch is measured as a multiple of the block size for compressed surfaces. When the horizontal mipmap layout is used for 1D textures then the qpitch is measured in pixels instead of rows. v2: Align the depth/stencil textures to a multiple of 8 v3: Add an assert that ALL_SLICES_AT_EACH_LOD is not used. Ignore the vertical alignment when picking the qpitch for 1D_ARRAY textures. Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/skl: Don't use ALL_SLICES_AT_EACH_LODNeil Roberts2015-04-201-10/+20
| | | | | | | | | | | | | | | | | | | | | The render surface state command for Skylake doesn't have the surface array spacing bit so it's not possible to select this layout. I think it was only used in order to make it pick a tightly-packed qpitch value that doesn't include space for the mipmaps. However this won't be necessary after the next patch because it will automatically pick a packed qpitch value whenever first_level==last_level. It is better to remove this layout entirely on Gen8+ because although it can effectively be implemented with a small qpitch value when there are no mipmaps it isn't possible to support the case where there are mipmaps because in that case the layout is very different. It could be good to make a similar change for Gen8 if we also change the layouting code to pick the qpitch value in a similar way. v2: Make the commit message and comments more convincing Reviewed-by: Ben Widawsky <[email protected]> Tested-by: Ben Widawsky <[email protected]>
* i965: Issue perf_debug messages for unsynchronized maps on !LLC systems.Kenneth Graunke2015-04-171-5/+11
| | | | | | | | | | | | | | | | | We haven't implemented proper unsynchronized map support on !LLC systems (pre-SNB, Atom). MapBufferRange with GL_MAP_UNSYNCHRONIZE_BIT will actually do a synchronized map, probably killing performance. Also warn on BufferSubData, when we should be doing an unsynchronized upload, but instead have to do a synchronous map. v2: Only complain if the buffer is actually busy - we use unsynchronized maps internally for vertex upload and such, but expect those to not be busy. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Tested-by: Ben Widawsky <[email protected]>