path: root/src/gallium
Commit message (Collapse) | Author | Age | Files | Lines
* ilo: add kernel queries for compute shaders | Chia-I Wu | 2014-11-06 | 3 | -0/+37
  We need to know the local/input/private sizes and others.
  This is not complete. We need many others for CURBE setup.
  Signed-off-by: Chia-I Wu <[email protected]>
* ilo: fix compute params | Chia-I Wu | 2014-11-06 | 1 | -12/+36
  Based on beignet, hardware capabilities, and OpenCL requirements.
  Signed-off-by: Chia-I Wu <[email protected]>
* ilo: add eu_count and thread_count to ilo_dev_info | Chia-I Wu | 2014-11-06 | 3 | -55/+77
  They will be used to report compute params or program compute states.
  thread_count can also be used for 3DSTATE_VS.
  Signed-off-by: Chia-I Wu <[email protected]>
* ilo: fix intel_bo_wait() on kernel 3.17 | Chia-I Wu | 2014-11-06 | 1 | -1/+7
  drm_intel_gem_bo_wait() with negative timeout is broken on kernel 3.17.
  Signed-off-by: Chia-I Wu <[email protected]>
* ilo: add drm_configuration for the pipe-target | Nick Sarnie | 2014-11-04 | 1 | -1/+22
  Allows the driver to advertise DMA-BUF and throttling.
* clover: Fix clBuildProgram piglit regression | Tom Stellard | 2014-11-03 | 1 | -4/+4
  clBuildProgram() should trigger CL_INVALID_VALUE if device_list is NULL and num_devices is greater than zero.
  Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
  Reported by: EdB
  Reviewed-by: Francisco Jerez <[email protected]>
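The argument check named in the commit above can be sketched roughly as follows. This is not clover's code: `validate_device_list` and the `SKETCH_CL_*` constants are hypothetical names introduced for illustration (the numeric values mirror `CL_SUCCESS` and `CL_INVALID_VALUE` from the OpenCL headers), and the second half of the condition comes from the OpenCL specification rather than from the commit message.

```c
#include <stddef.h>

/* Local stand-ins so the sketch builds without the OpenCL headers.
 * Assumption: values mirror CL_SUCCESS and CL_INVALID_VALUE. */
#define SKETCH_CL_SUCCESS        0
#define SKETCH_CL_INVALID_VALUE  (-30)

/* Hypothetical helper, not clover's actual function: checks the
 * (num_devices, device_list) pair passed to clBuildProgram(). */
static int
validate_device_list(unsigned num_devices, const void *device_list)
{
   /* A count without a list (the case from the commit message), or a
    * list without a count (also invalid per the OpenCL spec). */
   if ((device_list == NULL && num_devices > 0) ||
       (device_list != NULL && num_devices == 0))
      return SKETCH_CL_INVALID_VALUE;

   return SKETCH_CL_SUCCESS;
}
```

Passing a NULL list with a zero count stays valid, since it means "build for all devices associated with the program".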
* gallivm: Disable frame-pointer-omission on x86 to ensure right stack alignment. | José Fonseca | 2014-11-03 | 1 | -1/+3
  Between releases 3.2 and 3.3, LLVM stopped aligning the stack properly under certain conditions (no allocas, but a large number of vectors causing spills to the stack, and frame pointer omission enabled).
  We were already disabling frame-pointer-omission on several build types, but we now disable it on all build types.
  It's not clear whether this affects 32-bit x86 processes only, or if it can also affect 64-bit x86_64 processes when AVX registers are available and used. So disable frame-pointer-omission on both x86/x86_64 to be on the safe side.
  See also:
  - http://llvm.org/PR21435
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: When disassemble a function, start by printing out its name. | José Fonseca | 2014-11-03 | 1 | -0/+1
  To help recognize what it's supposed to do.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/docs: fix NRM, NRM4 docs | Brian Paul | 2014-11-01 | 1 | -12/+24
  Need to do a sqrt().
  FWIW, the HTML that Sphinx 1.1.3 generates for the math expressions looks completely broken.
  Reviewed-by: José Fonseca <[email protected]>
* softpipe: use the tgsi_free_tokens() function | Brian Paul | 2014-10-31 | 1 | -6/+6
  Reviewed-by: Charmaine Lee <[email protected]>
* tgsi: add a tgsi_free_tokens() function | Brian Paul | 2014-10-31 | 2 | -0/+13
  To match tgsi_alloc_tokens().
  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify u_pstipple.c code | Brian Paul | 2014-10-31 | 1 | -123/+62
  Use the new helper functions in the tgsi_transform.h file to emit declarations and instructions.
  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify temp register selection in u_pstipple.c | Brian Paul | 2014-10-31 | 1 | -27/+18
  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify util_pstipple_create_fragment_shader() params | Brian Paul | 2014-10-31 | 3 | -38/+28
  Pass and return tgsi_token buffers instead of pipe_shader_state.
  And update softpipe driver (the only user of this function).
  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: remove unused softpipe_create_fs_variant_exec() parameter | Brian Paul | 2014-10-31 | 3 | -5/+3
  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: check for SP_NEW_STIPPLE when building quad pipeline | Brian Paul | 2014-10-31 | 1 | -0/+1
  Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.
  Reviewed-by: Charmaine Lee <[email protected]>
* r600g: Fix build with opencl and radeonsi disabled | Tom Stellard | 2014-10-31 | 1 | -6/+6
* clover: Fix bug when binary programs are passed to clBuildProgram() v2 | Tom Stellard | 2014-10-31 | 2 | -6/+14
  This was a regression introduced by 611d66fe4513e53bde052dd2bab95d448c909a2a
  Passing a binary program to clBuildProgram() is legal, but passing one to clCompileProgram() is not.
  v2:
  - Code cleanups.
  Reviewed-by: Francisco Jerez <[email protected]>
* clover: Factor input validation of clCompileProgram into a new function v2 | Tom Stellard | 2014-10-31 | 1 | -10/+23
  This factors out the validation that is common with clBuildProgram().
  v2:
  - Code cleanups.
  Reviewed-by: Francisco Jerez <[email protected]>
* radeonsi/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 4 | -59/+127
  v2:
  - Drop dependency on LLVM >= 3.5.1
  - Rename si_create_shader() to si_shader_binary_read()
* r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 8 | -97/+180
  v2:
  - Drop dependency on LLVM >= 3.5.1
* gallium/radeon: Add query for symbol specific config information | Tom Stellard | 2014-10-31 | 3 | -0/+86
  This adds a query which allows drivers to access the config information of a specific function within the LLVM generated ELF binary. This makes it possible for the driver to handle ELF binaries with multiple kernels / global functions.
* r300g: remove enabled/disabled hyperz and AA compression messages | Marek Olšák | 2014-10-30 | 1 | -2/+0
  It's annoying with octave.
  Reported by Michael Burian.
  Cc: 10.2 10.3 <[email protected]>
* r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param' | Dieter Nützel | 2014-10-30 | 1 | -1/+0
  Signed-off-by: Dieter Nützel <[email protected]>
* radeon/llvm: Dynamically allocate branch/loop stack arrays | Michel Dänzer | 2014-10-29 | 2 | -6/+37
  This prevents us from silently overflowing the stack arrays, and allows arbitrary stack depths.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
  Cc: [email protected]
  Reported-and-Tested-by: Nick Sarnie <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
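The general technique behind the commit above, replacing a fixed-size stack array with one that grows on demand, might look roughly like this. It is only a sketch: `struct dyn_stack`, `dyn_stack_push` and `dyn_stack_pop` are hypothetical names, and the element type and growth policy are assumptions, not what radeon/llvm actually uses.

```c
#include <stdlib.h>

/* Hypothetical growable stack; names and element type are illustrative. */
struct dyn_stack {
   int *items;         /* heap storage, NULL while empty */
   unsigned depth;     /* entries currently in use */
   unsigned capacity;  /* entries currently allocated */
};

/* Pushes a value, growing the storage when it is full.
 * Returns 1 on success, 0 if the allocation failed. */
static int
dyn_stack_push(struct dyn_stack *s, int value)
{
   if (s->depth == s->capacity) {
      /* Assumed policy: start at 16 entries, then double. */
      unsigned new_capacity = s->capacity ? s->capacity * 2 : 16;
      int *items = realloc(s->items, new_capacity * sizeof(*items));

      if (!items)
         return 0; /* old storage is still valid and untouched */

      s->items = items;
      s->capacity = new_capacity;
   }

   s->items[s->depth++] = value;
   return 1;
}

/* Pops the top value into *value.
 * Returns 1 on success, 0 if the stack is empty. */
static int
dyn_stack_pop(struct dyn_stack *s, int *value)
{
   if (s->depth == 0)
      return 0;

   *value = s->items[--s->depth];
   return 1;
}
```

Because every push checks the capacity first, a deeply nested shader can no longer write past the end of the array; the caller frees `items` once the stack is no longer needed.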
* vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. | Eric Anholt | 2014-10-28 | 10 | -34/+407
  Fixes 14 ARB_vp tests (which had no lowering done), and should improve performance of indirect uniform array access in GLSL.
* vc4: Fix mixup of return type in reloc_tex(). | Eric Anholt | 2014-10-28 | 1 | -2/+2
* vc4: Drop redundant check for is_tmu_write(). | Eric Anholt | 2014-10-28 | 1 | -3/+0
  This function is only called when it would return true.
* vc4: Don't forget to validate code that's got PROG_END on it. | Eric Anholt | 2014-10-28 | 1 | -5/+6
  This signal doesn't terminate the program now, it terminates the program soon. So you have to actually validate the code in the instruction.
* vc4: Add .dir-locals.el for kernel style in the kernel code. | Eric Anholt | 2014-10-28 | 1 | -0/+12
* vc4: Fix a couple missing '\n's in error output. | Eric Anholt | 2014-10-28 | 2 | -2/+2
* r300g/vdpau: enable again | David Heidelberger | 2014-10-28 | 1 | -0/+1
  Signed-off-by: David Heidelberger <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* r300g: only set clip_halfz for chips with HW TCL | Marek Olšák | 2014-10-28 | 1 | -1/+1
  I forgot that we cannot emit vertex shader state on a chip without VS. In such a case, clip_halfz is handled by the Draw module.
* radeonsi: fix incorrect index buffer max size for lowered 8-bit indices | Marek Olšák | 2014-10-28 | 1 | -1/+1
  Cc: 10.2 10.3 [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 1 | -3/+3
  Fixes piglit/polygon-mode-offset.
  Cc: 10.2 10.3 [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 2 | -6/+6
  Fixes piglit/polygon-mode-offset.
  Cc: 10.2 10.3 [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: Implement sm5 UBO/sampler indexing | Glenn Kennard | 2014-10-28 | 7 | -19/+164
  Caveat: Shaders using UBO/sampler indexing will not be optimized by SB, due to SB not currently supporting the necessary CF_INDEX_[01] index registers.
  Signed-off-by: Glenn Kennard <[email protected]>
* r600g: Implement sm5 interpolation functions | Glenn Kennard | 2014-10-28 | 2 | -3/+237
  Requires evergreen/cayman
  Signed-off-by: Glenn Kennard <[email protected]>
* nv50: handle inverted render conditions | Tobias Klausmann | 2014-10-26 | 4 | -10/+51
  This enables ARB_conditional_render_inverted.
  Signed-off-by: Tobias Klausmann <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: consider instruction neighbors in cp | Rob Clark | 2014-10-25 | 2 | -11/+178
  Fanin (merge) nodes require their srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in the copy-propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors.
  This lets us not fall on our face when we encounter things like:
    1: MOV TEMP[2], IN[0].xyzw
    2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
    3: MOV TEMP[2].xy, IN[0].yxzz
    4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
    5: END
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: always mov tex coords | Rob Clark | 2014-10-25 | 1 | -54/+30
  Always insert extra mov's for the tex coord into the fanin. This simplifies things a bit, and avoids a scenario where multiple sam instructions can have mutually exclusive inputs to their fanin, for example:
    1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
    2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D
  The CP pass can always remove the mov's that are not actually needed, so better to start out with too many mov's in the front end, than not enough.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: rename a couple debug flags | Rob Clark | 2014-10-25 | 3 | -7/+7
    dscis -> noscis
    dbypass -> nobypass
  A bit more consistent w/ nobin, etc. And IMO a bit more sensible names.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: skip virtual outputs in standalone compiler | Rob Clark | 2014-10-25 | 1 | -0/+3
  Kills get added to the outputs list, to ensure they get scheduled. But they aren't *really* outputs so skip them in the header comment block.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: standalone compiler updates for ir3test | Rob Clark | 2014-10-25 | 4 | -18/+51
  In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily.
  See: git://people.freedesktop.org/~robclark/ir3test
  In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader.
  Signed-off-by: Rob Clark <[email protected]>
* ilo: improve blob decoding | Chia-I Wu | 2014-10-25 | 1 | -8/+31
  The last few dwords were skipped if the total number of dwords was not a multiple of 4.
  Change the formatting for better readability.
  Signed-off-by: Chia-I Wu <[email protected]>
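The class of bug described above (a dump loop that prints four dwords per row and drops a trailing partial row) can be avoided along these lines. This is an illustrative sketch, not ilo's decoder: `dump_dwords` is a hypothetical name and the output format is an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical dump helper: prints up to 4 dwords per row and also emits
 * the final, shorter row when count is not a multiple of 4.
 * Returns the number of dwords printed. */
static unsigned
dump_dwords(FILE *fp, const uint32_t *dw, unsigned count)
{
   unsigned printed = 0;
   unsigned i, j;

   for (i = 0; i < count; i += 4) {
      /* Clamp the row length instead of assuming 4 entries remain. */
      const unsigned row = (count - i < 4) ? count - i : 4;

      fprintf(fp, "0x%08x:", i * 4); /* byte offset of the row */
      for (j = 0; j < row; j++) {
         fprintf(fp, " 0x%08x", (unsigned) dw[i + j]);
         printed++;
      }
      fputc('\n', fp);
   }

   return printed;
}
```

A loop written as `for (i = 0; i + 4 <= count; i += 4)` with an unconditional four-entry body would show the behavior the commit fixes; clamping `row` is one simple way to cover the tail.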
* llvmpipe: Ensure the packed input of the lp_test_format is aligned. | José Fonseca | 2014-10-24 | 1 | -2/+10
  Fixes:
  - https://bugs.freedesktop.org/show_bug.cgi?id=85377
  - http://llvm.org/bugs/show_bug.cgi?id=21365
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Flush stdout on lp_test_* unit tests. | José Fonseca | 2014-10-24 | 2 | -0/+3
  So that the order of test messages and gallivm/llvmpipe debug output is preserved.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: introduce PIPE_CAP_CLIP_HALFZ. | Mathias Fröhlich | 2014-10-24 | 15 | -0/+20
  In preparation of ARB_clip_control. Let the driver decide if it supports pipe_rasterizer_state::clip_halfz being set to true.
  v3: Initially enable on ilo.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Mathias Froehlich <[email protected]>
* vc4: Reuse uniform_data/contents indices when making uniforms. | Eric Anholt | 2014-10-24 | 1 | -0/+7
  This allows vc4_opt_cse.c to CSE-away operations involving the same uniform values.
    total instructions in shared programs: 37341 -> 36906 (-1.16%)
    instructions in affected programs: 10233 -> 9798 (-4.25%)
    total uniforms in shared programs: 10523 -> 10320 (-1.93%)
    uniforms in affected programs: 2467 -> 2264 (-8.23%)
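The idea of reusing an existing index for an identical (contents, data) pair, so that later passes see the same operand and can eliminate duplicates, can be sketched as below. The table layout, `MAX_UNIFORMS`, and `get_uniform_index` are hypothetical and are not vc4's actual data structures.

```c
#include <stdint.h>

#define MAX_UNIFORMS 64 /* assumed limit, for the sketch only */

/* Hypothetical uniform table: one (contents, data) pair per slot. */
struct uniform_table {
   int contents[MAX_UNIFORMS];  /* what kind of uniform this is */
   uint32_t data[MAX_UNIFORMS]; /* payload for that kind */
   unsigned count;              /* slots in use */
};

/* Returns the index for (contents, data), reusing an existing slot when
 * the same pair was requested before. Returns -1 if the table is full. */
static int
get_uniform_index(struct uniform_table *t, int contents, uint32_t data)
{
   unsigned i;

   /* Linear search is fine for a sketch; identical pairs share a slot. */
   for (i = 0; i < t->count; i++) {
      if (t->contents[i] == contents && t->data[i] == data)
         return (int) i;
   }

   if (t->count == MAX_UNIFORMS)
      return -1;

   t->contents[t->count] = contents;
   t->data[t->count] = data;
   return (int) t->count++;
}
```

With one index per distinct uniform, two instructions that read the same value carry identical operands, which is what lets a CSE pass treat them as the same expression.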
* vc4: When asked to discard-map a whole resource, discard it. | Eric Anholt | 2014-10-24 | 1 | -14/+28
  This saves a bunch of extra flushes when texsubimaging a whole texture that's been used for rendering, or subdataing a whole BO. In particular, this massively reduces the runtime of piglit texture-packed-formats (when the probes have been moved out of the inner loop).