summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-041-1/+1
| | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <[email protected]> CC: "10.6 11.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nv30: Implement color resolve for msaaHans de Goede2015-09-042-14/+8
| | | | | | | | | | | Note this is not ideal. Since the sifm can only do source sizes upto 1024x1024 we end up using the blitter on nv4x, which is not that fast. And on nv3x we end up using the cpu which is really slow. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv30: Fix creation of scanout buffersHans de Goede2015-09-041-0/+10
| | | | | | | | | | | | | | | | | | | | | Scanout buffers on nv30 must always be non-swizzled and have special width alignment constraints. These constrains have been taken from the xf86-video-nouveau src/nv_accel_common.c: nouveau_allocate_surface() function. nouveau_allocate_surface() applies these width constraints only when a tiled attribute is set, which it sets for all surfaces allocated via dri, and this "tiling" is not the same as swizzling, scanout surfaces must be linear / have a uniform_pitch or only complete garbage is shown. This commit fixes dri3 on nv30 showing a garbled display, with dri3 the scanout buffers are allocated by mesa, rather then by the ddx, and the wrong stride of these buffers was causing the garbled display. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Initialize pack field of qreg to 0 in qir_get_tempBoyan Ding2015-09-041-0/+1
| | | | | | | | | | | | | This avoids generation of undefined packing in qir and qpu instructions, fixing a lot of rendering errors. Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL instruction when possible.) Cc: [email protected] Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixelsChris Wilson2015-09-042-0/+8
| | | | | | | | | | | | | | | The tiled memcpy fast paths perform a simple blit (with only a couple of trivial pixel conversion routines) and do not accommodate PixelTransfer operations. Therefore if any are set, fallback to the regular routines. Note that PixelTransfer only applies to TexImage and ReadPixels, not to GetTexImage. Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* i965/vec4: Don't unspill the same register in consecutive instructionsIago Toral Quiroga2015-09-041-8/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have spilled/unspilled a register in the current instruction, avoid emitting unspills for the same register in the same instruction or consecutive instructions following the current one as long as they keep reading the spilled register. This should allow us to avoid emitting costy unspills that come with little benefit to register allocation. v2: - Apply the same logic when evaluating spilling costs (Curro). v3: - Abstract the logic that decides if a register can be reused in a function. that can be used from both spill_reg and evaluate_spill_costs (Curro). v4: - Do not disallow reusing scratch_reg in predicated reads (Curro). - Track if previous sources in the same instruction read scratch_reg (Curro). - Return prev_inst_read_scratch_reg at the end (Curro). - No need to explicitily skip scratch read/write opcodes in spill_reg (Curro). - Fix the comments explaining what happens when we hit an instruction that does not read or write scratch_reg (Curro) - Return true early when the current or previous instructions read scratch_reg with a compatible mask. v5: - Do not return true early, the loop should not be expensive anyway and this adds more complexity (Curro). Reviewed-by: Francisco Jerez <[email protected]>
* i965: Add a debug option for spilling everything in vec4 codeIago Toral Quiroga2015-09-044-5/+7
| | | | Reviewed-by: Francisco Jerez <[email protected]>
* dri/common: Tokenize driParseDebugString() argument before matching debug flags.Francisco Jerez2015-09-041-4/+13
| | | | | | | Fixes debug string parsing when one of the supported flags is a substring of another. Reviewed-by: Iago Toral Quiroga <[email protected]>
* dri/common: Fix codestyle of driParseDebugString().Francisco Jerez2015-09-041-8/+6
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: error out on ES 3.1 if VS or FS present but not bothTapani Pälli2015-09-041-4/+25
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: error on linking if no shaders are attached to programTapani Pälli2015-09-041-0/+19
| | | | | | | This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Improve disassembly of data port read messages.Kenneth Graunke2015-09-031-4/+27
| | | | | | | | We now print out the name of the message instead of its numerical value, and label the message control and surface numbers. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Optimize VUE map comparisons.Kenneth Graunke2015-09-032-4/+4
| | | | | | | | | | | | The entire VUE map is computed based on the slots_valid bitfield; calling brw_compute_vue_map on the same bitfield will return the same result. So we can simply compare those. struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is much cheaper and should work just as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/gs: Don't reserve space for clip plane uniforms.Kenneth Graunke2015-09-031-2/+0
| | | | | | | | These were only for legacy userclipping, which we no longer support in geometry shaders. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't do legacy userclipping in non-compatibility contexts.Kenneth Graunke2015-09-031-0/+1
| | | | | | | | | | | | | | | | | According to the GLSL 1.50 specification, page 76: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." With this patch, we only enable clip distance writes when the shader actually writes them. We no longer force a value to be written when clip planes are enabled in the API. This could mean the first varying slot would be used as clip distances - I believe it should be the safe kind of undefined behavior. Empirically, it doesn't seem to cause a problem. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove the brw_vue_prog_key base class.Kenneth Graunke2015-09-039-63/+45
| | | | | | | | | The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Virtualize vec4_visitor::emit_urb_slot().Kenneth Graunke2015-09-034-16/+28
| | | | | | | | | | | | This avoids a downcast of key, which won't exist in the base class soon. I'm not a huge fan of this patch, but given that we're currently using inheritance, this seems like the "right" way to do it. The alternative is to make key a void pointer in the parent class and continue downcasting. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Store a key_tex pointer in vec4_visitor.Kenneth Graunke2015-09-033-8/+10
| | | | | | | | | I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move legacy clip plane handling to vec4_vs_visitor.Kenneth Graunke2015-09-036-67/+74
| | | | | | | | | | | | | This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader. Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Delete the brw_vue_program_key::userclip_active flag.Kenneth Graunke2015-09-035-22/+15
| | | | | | | | | | | | | | | | | There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove legacy clip plane handling from geometry shaders.Kenneth Graunke2015-09-033-33/+8
| | | | | | | | | | | | | | | | | | We only support geometry shaders in core profiles, where gl_ClipVertex doesn't exist. Presumably the even older behavior of clipping to gl_Position isn't supported either. In fact, GLSL 1.50 page 76 claims: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." So we don't need to handle legacy clipping in geometry shaders. I think Paul added this back when we were considering supporting the old GL_ARB_geometry_shader4 extension. This removes a non-orthagonal state dependency on GS compilation. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move brw_setup_tex_for_precompile to brw_program.[ch].Kenneth Graunke2015-09-034-22/+23
| | | | | | | | | | | | This living in brw_fs.{h,cpp} is a historical artifact of us supporting texturing for fragment shaders before any other stages. It's kind of awkward given that we use it for all stages. This avoids having to include brw_fs.h in geometry shader code in order to access this function. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* mesa: change 'SHADER_SUBST' facility to work with env variablesTapani Pälli2015-09-042-38/+115
| | | | | | | | | | | | | | | | | | | Patch modifies existing shader source and replace functionality to work with environment variables rather than enable dumping on compile time. Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid collisions. Functionality is controlled via two environment variables: MESA_SHADER_DUMP_PATH - path where shader sources are dumped MESA_SHADER_READ_PATH - path where replacement shaders are read v2: cleanups, add strerror if fopen fails, put all functionality inside HAVE_SHA1 since sha1 is required Signed-off-by: Tapani Pälli <[email protected]> Suggested-by: Eero Tamminen <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* build: add HAVE_SHA1 define when using --with-sha1 optionTapani Pälli2015-09-041-0/+5
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Acked-by: Brian Paul <[email protected]>
* i965: Fix copy propagation type changes.Kenneth Graunke2015-09-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | commit 472ef9a02f2e5c5d0caa2809cb736a0f4f0d4693 introduced code to change the types of SEL and MOV instructions for moves that simply "copy bits around". It didn't account for type conversion moves, however. So it would happily turn this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:F, vgrf6:UD into this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:D, -vgrf5:D which erroneously drops the conversion to float. Cc: "11.0 10.6" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* r600: fix loop overrun in cayman_mul_double_instrDave Airlie2015-09-041-1/+1
| | | | | | Coverity warned about this. Ilia pointed it out. Signed-off-by: Dave Airlie <[email protected]>
* i965/gen9: Annotate input coverage mask changeBen Widawsky2015-09-032-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | As far as I can tell, the behavior is preserved from the previous generations. Before we set a single bit to tell the FS whether or not we'll be using an input coverage mask. Now we have some options which are implementing various extensions. These bits are used for the various conservative rasterization mechanisms (for collision detection, binning, and whatever else). I believe that the behavior is preserved because the problem which conservative rasterization is attempting to fix would go away with the "NORMAL" mode (at the cost of performance, I believe). This patch serves as documentation of the change by creating the enums, as well as giving some of the history with the links here so that the next person who comes along and looks at it doesn't spend as long as I had to in order to determine if there is an issue or not. Previously, this algorithm had been done in software, and this can still be used as long as we don't export an extension stating otherwise. References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* svga: update call to u_upload_alloc()Brian Paul2015-09-031-3/+3
| | | | | | u_upload_alloc() no longer returns a return value. Trivial.
* winsys/radeon: remove exported buffers from the cacheMarek Olšák2015-09-031-0/+3
| | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* winsys/amdgpu: remove exported buffers from the cacheMarek Olšák2015-09-031-0/+3
| | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitlyMarek Olšák2015-09-032-6/+41
| | | | | | | | This must be done before exporting a buffer as dmabuf fds, because we lose track of who is using it and can't trust the reference counter. Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* u_upload_mgr: remove the return value from u_upload_dataMarek Olšák2015-09-034-25/+22
| | | | Reviewed-by: Brian Paul <[email protected]>
* u_upload_mgr: remove the return value from u_upload_bufferMarek Olšák2015-09-032-31/+18
| | | | Reviewed-by: Brian Paul <[email protected]>
* u_upload_mgr: remove the return value from u_upload_alloc_bufferMarek Olšák2015-09-031-11/+9
| | | | Reviewed-by: Brian Paul <[email protected]>
* u_upload_mgr: remove the return value from u_upload_allocMarek Olšák2015-09-037-44/+48
| | | | | | The return buffer or the returned pointer can be used instead. Reviewed-by: Brian Paul <[email protected]>
* u_upload_mgr: optimize u_upload_allocMarek Olšák2015-09-031-15/+17
| | | | | | | This is probably the most called util function. It does almost nothing, yet it can consume 10% of the CPU on the profile. This drops it down to 5%. Reviewed-by: Brian Paul <[email protected]>
* gallium/radeon: remove 'dirty' member from r600_atomGrazvydas Ignotas2015-09-034-6/+1
| | | | | | It's no longer used by both r600 and radeonsi now. Signed-off-by: Marek Olšák <[email protected]>
* r600g: simplify dirty atom trackingGrazvydas Ignotas2015-09-033-49/+14
| | | | | | | Now that R600_NUM_ATOMS is below 64, dirty atom tracking can be simplified. Signed-off-by: Marek Olšák <[email protected]>
* r600g: start numbering atoms from 1Grazvydas Ignotas2015-09-033-3/+3
| | | | | | | There doesn't seem any reason to start from 4. Start from 1 instead (0 is left reserved to catch uninitialized atoms). Signed-off-by: Marek Olšák <[email protected]>
* r600g: make all viewport states use single atomGrazvydas Ignotas2015-09-036-34/+38
| | | | | | | Similarly to scissor states, we can use single atom to track all viewport states. This will allow to simplify dirty atom handling later. Signed-off-by: Marek Olšák <[email protected]>
* r600g: apply disable workaround on all scissorsGrazvydas Ignotas2015-09-032-9/+14
| | | | | | | | During review of the "r600g: make all scissor states use single atom" patch Marek Olšák noticed that scissor disable workaround should be applied on all scissor states and not just first one, so let's do so. Signed-off-by: Marek Olšák <[email protected]>
* r600g: make all scissor states use single atomGrazvydas Ignotas2015-09-036-40/+62
| | | | | | | As suggested by Marek Olšák, we can use single atom to track all scissor states. This will allow to simplify dirty atom handling later. Signed-off-by: Marek Olšák <[email protected]>
* mesa/pbo: Handle zero width, height or depth when validating accessNeil Roberts2015-09-031-0/+6
| | | | | | | | | | | | | | | It's legal to call glTexSubImage with zero values for the width, height or depth. Previously this was breaking the PBO access validation because it tries to work out the last pixel accessed by getting the pixel at height-1 and depth-1 which would end up with bogus values. This was causing GL errors to be generated during the Piglit texsubimage test, although the test was passing anyway. v2: Also check for width == 0. Don't validate the start pointer if any of the dimensions are zero. Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Remove unused total_attribs_size variable.Kenneth Graunke2015-09-031-1/+0
| | | | | | Accidentally left behind by my previous patch. Signed-off-by: Kenneth Graunke <[email protected]>
* glsl: Handle attribute aliasing in attribute storage limit check.Kenneth Graunke2015-09-021-28/+36
| | | | | | | | | | | | | | | | | | | | | | | | | In various versions of OpenGL and GLSL, it's possible to declare multiple VS input variables with aliasing attribute locations. So, when computing the storage requirements for vertex attributes, we can't simply add up the sizes. Instead, we need to look at the enabled slots. This patch begins tracking which attributes are double types that are larger than 128-bits (i.e. take up two vec4 slots). We then count normal attributes once, and count the double-size attributes a second time. Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests on i965, which regressed with commit ad208d975a6d3aebe14f7c2c16039ee20. No Piglit changes on llvmpipe (which actually supports dvecs). Cc: "10.6 11.0" <[email protected]> Tested-by: Mark Janes <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965/meta: Fix typo in commentIan Romanick2015-09-021-1/+1
| | | | | | Trivial. Signed-off-by: Ian Romanick <[email protected]>
* mesa: Don't allow wrong type setters for matrix uniformsIan Romanick2015-09-021-0/+25
| | | | | | | | | | | | | Previously we would allow glUniformMatrix4fv on a dmat4 and glUniformMatrix4dv on a mat4. Both are illegal. That later also overwrites the storage for the mat4 and causes bad things to happen. Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Cc: Dave Airlie <[email protected]> Cc: "10.6 11.0" <[email protected]>
* mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_typeIan Romanick2015-09-023-42/+42
| | | | | | | | | | | | This matches _mesa_uniform, and it enables the bug fix in the next patch. v2: s/type/basicType/ in the assert in _mesa_uniform_matrix. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> [v1] Cc: Dave Airlie <[email protected]> Cc: "10.6 11.0" <[email protected]>
* mesa: Silence unused parameter warnings in bufferobj.cIan Romanick2015-09-021-0/+2
| | | | | | | | | | | | | | main/bufferobj.c: In function 'count_buffer_size': main/bufferobj.c:520:26: warning: unused parameter 'key' [-Wunused-parameter] count_buffer_size(GLuint key, void *data, void *userData) ^ main/bufferobj.c: In function 'flush_mapped_buffer_range_fallback': main/bufferobj.c:740:56: warning: unused parameter 'index' [-Wunused-parameter] gl_map_buffer_index index) ^ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Remove target parameter from _mesa_handle_bind_buffer_genIan Romanick2015-09-023-7/+4
| | | | | | | | | | main/bufferobj.c: In function '_mesa_handle_bind_buffer_gen': main/bufferobj.c:915:37: warning: unused parameter 'target' [-Wunused-parameter] GLenum target, ^ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>