Commit message | Author | Age | Files | Lines
* intel: Allocate prog_data::[pull_]param deeper inside the compilerJason Ekstrand2017-10-129-88/+55
| | | | | | | | | | | | | Now that we're always growing the param array as-needed, we can allocate the param array in common code and stop repeating the allocation everywhere. In order to keep things sane, we ralloc the [pull_]param array off of the compile context and then steal it back to a NULL context later. This doesn't get us all the way to where prog_data::[pull_]param is purely an out parameter of the back-end compiler but it gets us a lot closer. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* ralloc: Allow reparenting to a NULL contextJason Ekstrand2017-10-121-1/+1
| | | | | | | Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Refactor setup of the prog_data::param arrayJason Ekstrand2017-10-121-14/+9
| | | | | | | | Now that the only thing we put in the array up-front are client push constants, we can simplify anv_pipeline_compile a bit. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Grow the param array for imagesJason Ekstrand2017-10-122-7/+5
| | | | | | | | Before, we were calculating up-front and then filling in later. Now we just grow as needed in anv_nir_apply_pipeline_layout. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Whack nir->num_uniforms to MAX_PUSH_CONSTANT_SIZEJason Ekstrand2017-10-121-4/+2
| | | | | | | | | | | | This way any image uniforms end up having locations higher than MAX_PUSH_CONSTANT_SIZE. There's no bug here at the moment, but this consistency will make the next commit easier. Also, because nir_apply_pipeline_layout properly increments nir->num_uniforms when it expands the param array, we no longer need to stomp it to match prog_data::nr_params because it already does. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/vs: Grow the param array for clip planesJason Ekstrand2017-10-123-5/+14
| | | | | | | | Instead of requiring the caller of brw_compile_vs to figure it out, just grow the param array on-demand. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/cs: Grow prog_data::param on-demand for thread_local_id_indexJason Ekstrand2017-10-124-22/+9
| | | | | | | | | | Instead of making the caller of brw_compile_cs add something to the param array for thread_local_id_index, just add it on-demand in brw_nir_intrinsics and grow the array. This is now safe to do because everyone is now using ralloc for prog_data::param. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Make brw_nir_lower_intrinsics compute-specificJason Ekstrand2017-10-125-19/+13
| | | | | | | | | It's already only ever called from brw_compile_cs and only handles compute intrinsics. Let's just make it CS-specific. We can always make it handle other stages again later if we want. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add a helper for growing the prog_data::param arrayJason Ekstrand2017-10-121-0/+13
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
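The commit above carries no description, so here is a minimal sketch of what an "append slots to the param array" helper can look like. It assumes params are plain `uint32_t` handles and uses `realloc` in place of Mesa's ralloc; the names `struct prog_params` and `grow_params` are hypothetical and are not the helper this commit adds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the param-related part of prog_data. */
struct prog_params {
   uint32_t *param;     /* array of param handles, may start out NULL */
   unsigned nr_params;  /* number of valid entries in param */
};

/* Append nr_new slots to the array and return a pointer to the first new
 * slot, or NULL if the allocation fails (the old array is left intact).
 * Uses realloc for the sketch; the real code allocates with ralloc. */
static uint32_t *grow_params(struct prog_params *p, unsigned nr_new)
{
   assert(nr_new > 0);

   unsigned old_nr = p->nr_params;
   uint32_t *grown = realloc(p->param,
                             ((size_t)old_nr + nr_new) * sizeof(uint32_t));
   if (grown == NULL)
      return NULL;

   p->param = grown;
   p->nr_params = old_nr + nr_new;
   return grown + old_nr;
}
```

Returning a pointer to the new slots lets a caller fill them in immediately without recomputing the old length, which is the pattern the later "grow on demand" commits rely on.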
* intel/compiler: Stop adding params for texture sizesJason Ekstrand2017-10-122-6/+0
| | | | | | | | We haven't needed this ever since we started using NIR for lowering rectangle textures. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Only add the wpos state reference if we lowered somethingJason Ekstrand2017-10-121-6/+6
| | | | | | | | | | | | Otherwise, in the ARB program case _mesa_add_state_reference may grow the parameter array which will cause brw_nir_setup_arb_uniforms to write past the end of the param array because it only looks at the parameter list length but the param array is allocated based on nir->num_uniforms. The only reason this hasn't caused us problems is because we are padding out the param array for fragment programs unnecessarily. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add a flag for pull constant supportJason Ekstrand2017-10-125-2/+13
| | | | | | | | | | | The Vulkan driver does not support pull constants. It simply limits things such that we can always push everything. Previously, we were determining whether or not to push things based on whether or not the prog_data::pull_param array is non-null. This is rather hackish and about to stop working. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Ralloc prog_data::param of the compile mem_ctxJason Ekstrand2017-10-121-2/+1
| | | | | | | | | This way we stop leaking it. This is completely safe because, when we hand it off to anv_shader_bin_create or anv_pipeline_cache_upload_kernel, they make a copy of the entire param array. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Add a mem_ctx parameter to anv_pipeline_compileJason Ekstrand2017-10-121-33/+39
| | | | | | | | This lets us avoid some of the manual ralloc stealing and prepares for future commits in which we will want to ralloc prog_data::param. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Store image_param in brw_context instead of prog_dataJason Ekstrand2017-10-1214-49/+17
| | | | | | | | | | This burns an extra 10k of memory or so in the case where you don't have any images. However, if you have several shaders which use images, this should be much less memory. It also gets rid of a part of prog_data that really has nothing to do with the compiler. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use prog->info.num_images for needs_dc computationJason Ekstrand2017-10-121-2/+3
| | | | | | | | This should be just as good as looking in prog_data but removes our one state setup dependency on brw_stage_prog_data::nr_image_param. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Rewrite the world of push/pull paramsJason Ekstrand2017-10-1223-151/+288
| | | | | | | | | | | | | | | | | This moves us away from the array of pointers model and onto a model where each param is represented by a generic uint32_t handle. We reserve 2^16 of these handles for builtins that get generated by somewhere inside the compiler and have well-defined meanings. Generic params have handles whose meanings are defined by the driver. The primary downside to this new approach is that it moves a little bit of the work that we would normally do at compile time to draw time. On my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to have any measurable effect on OglBatch7. So, while this may come back to bite us, it doesn't look too bad. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
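The handle model described above can be sketched roughly as follows. This is an illustration only: which 2^16 handles are reserved (the top of the 32-bit range is assumed here), the `PARAM_BUILTIN_*` names, `struct draw_state`, and the choice to treat generic handles as indices into client constants are all assumptions, not the driver's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the reserved 2^16 builtin handles are the top of the range. */
#define PARAM_BUILTIN_BASE     0xffff0000u
#define PARAM_BUILTIN_ZERO     (PARAM_BUILTIN_BASE + 0u)
#define PARAM_BUILTIN_EXAMPLE  (PARAM_BUILTIN_BASE + 1u) /* hypothetical builtin */

static bool param_is_builtin(uint32_t handle)
{
   return (handle & PARAM_BUILTIN_BASE) == PARAM_BUILTIN_BASE;
}

/* Hypothetical per-draw state the driver consults when resolving handles. */
struct draw_state {
   const uint32_t *client_constants;
   uint32_t num_client_constants;
   uint32_t example_builtin_value;
};

/* Resolve one handle to the 32-bit value that gets uploaded. */
static uint32_t resolve_param(const struct draw_state *s, uint32_t handle)
{
   if (param_is_builtin(handle)) {
      switch (handle) {
      case PARAM_BUILTIN_ZERO:    return 0;
      case PARAM_BUILTIN_EXAMPLE: return s->example_builtin_value;
      default:                    return 0; /* unknown builtin */
      }
   }
   /* Generic handle: its meaning belongs to the driver. In this sketch it
    * is an index into the client-provided constants. */
   return handle < s->num_client_constants ? s->client_constants[handle] : 0;
}

/* The draw-time work the message mentions: walk the handles and write
 * concrete values into the constant buffer. */
static void upload_push_constants(const struct draw_state *s,
                                  const uint32_t *params, unsigned nr_params,
                                  uint32_t *out)
{
   for (unsigned i = 0; i < nr_params; i++)
      out[i] = resolve_param(s, params[i]);
}
```

The per-handle branch in `resolve_param` is the kind of work that, under the old array-of-pointers model, would have been settled at compile time, which is consistent with the small draw-time cost the message reports.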
* i965: Get rid of gen7_cs_state.cJason Ekstrand2017-10-126-177/+145
| | | | | | | | | The only thing it was handling was push constants. We pull the actual constant upload code into gen6_constant_state.c and the atoms into genX_state_upload.c. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add a helper for populating constant buffersJason Ekstrand2017-10-123-12/+33
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move brw_upload_pull_constants to gen6_constant_state.cJason Ekstrand2017-10-123-64/+65
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Get rid of the variable on vote intrinsicsJason Ekstrand2017-10-122-5/+3
| | | | | | | | This looks like a copy+paste error. They don't actually write into that variable as would be implied by putting the return there. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* nir/opcodes: Fix constant-folding of ufind_msbJason Ekstrand2017-10-121-1/+1
| | | | | | | | | We didn't fold correctly in the case of 0x1 because we never let the loop counter hit 0. Switching it to bit >= 0 solves this problem. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: [email protected]
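The off-by-one described above can be reproduced in plain C. This is a sketch of the loop shape only, with hypothetical function names; it is not the NIR constant-folding source, and the buggy variant is assumed to have used `bit > 0` as its loop condition.

```c
#include <stdint.h>

/* Buggy shape: the loop exits before testing bit 0, so an input of 0x1
 * is reported as having no set bit at all (-1). */
static int ufind_msb_buggy(uint32_t src)
{
   int dst = -1;
   for (int bit = 31; bit > 0; bit--) {
      if ((src >> bit) & 1) {
         dst = bit;
         break;
      }
   }
   return dst;
}

/* Fixed shape: bit >= 0 lets the loop counter reach 0, so bit 0 is tested. */
static int ufind_msb_fixed(uint32_t src)
{
   int dst = -1;
   for (int bit = 31; bit >= 0; bit--) {
      if ((src >> bit) & 1) {
         dst = bit;
         break;
      }
   }
   return dst;
}
```

Only the input 0x1 distinguishes the two variants: every other nonzero value has a set bit above position 0, and zero yields -1 in both.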
* meta: Delete the PBO texsubimage path for realJason Ekstrand2017-10-122-496/+0
| | | | Acked-by: Kenneth Graunke <[email protected]>
* anv/pipeline_cache: Rework to use multialloc and blobJason Ekstrand2017-10-122-159/+141
| | | | | | | | | This gets rid of all of our hand-rolled size calculation and serialization code and replaces it with safe "standards" that are used elsewhere in anv and mesa. This should be significantly safer than rolling our own. Reviewed-by: Jordan Justen <[email protected]>
* anv/pipeline: Declare bind maps closer to their useJason Ekstrand2017-10-121-12/+6
| | | | | | This is just a trivial cleanup. Reviewed-by: Jordan Justen <[email protected]>
* anv/multialloc: Add new add_size helperJason Ekstrand2017-10-121-2/+4
| | | | Reviewed-by: Jordan Justen <[email protected]>
* compiler/blob: Make some parameters void instead of uint8_tJason Ekstrand2017-10-122-5/+5
| | | | | | | | | There are certain advantages to using uint8_t internally such as well-defined arithmetic on all platforms. However, interfaces that work in terms of raw data should use a void* type. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* compiler/blob: Constify the readerJason Ekstrand2017-10-123-11/+11
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* compiler/blob: Add (reserve|overwrite)_(uint32|intptr) helpersJason Ekstrand2017-10-122-2/+61
| | | | | | | | These helpers not only call blob_reserve_bytes but also make sure that the blob is properly aligned as if blob_write_* were called. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* compiler/blob: make blob_reserve_bytes() more usefulConnor Abbott2017-10-123-20/+12
| | | | | | | | | | | | | | Despite the name, it could only be used if you immediately wrote to the pointer. Nobody was using it outside of one test, so clearly this behavior wasn't that useful. Instead, make it return an offset into the data buffer so that the result isn't invalidated if you later write to the blob. In conjunction with blob_overwrite_bytes(), this will be useful for leaving a placeholder and then filling it in later, which we'll need to do for handling phi nodes when serializing NIR. v2 (Jason Ekstrand): - Detect overflow in the offset + to_write computation Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
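The reserve-then-overwrite pattern described above can be sketched with a tiny growable buffer. Everything here (`struct buf`, `buf_reserve`, `buf_write`, `buf_overwrite`) is a hypothetical miniature written for illustration, not the blob API itself; it shows why an offset stays valid across a later `realloc` where a raw pointer would not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a growable write buffer. */
struct buf {
   uint8_t *data;
   size_t size;      /* bytes written or reserved so far */
   size_t capacity;  /* bytes allocated */
};

/* Make room for extra more bytes; may move data to a new address. */
static bool buf_grow(struct buf *b, size_t extra)
{
   if (extra > SIZE_MAX - b->size)
      return false;
   if (b->size + extra <= b->capacity)
      return true;

   size_t new_cap = b->capacity ? b->capacity * 2 : 16;
   if (new_cap < b->size + extra)
      new_cap = b->size + extra;

   uint8_t *moved = realloc(b->data, new_cap);
   if (moved == NULL)
      return false;
   b->data = moved;
   b->capacity = new_cap;
   return true;
}

/* Reserve n bytes and return their offset, or -1 on failure. An offset,
 * unlike a pointer, survives any later reallocation of data. */
static ptrdiff_t buf_reserve(struct buf *b, size_t n)
{
   if (!buf_grow(b, n))
      return -1;
   size_t offset = b->size;
   b->size += n;
   return (ptrdiff_t)offset;
}

static bool buf_write(struct buf *b, const void *src, size_t n)
{
   ptrdiff_t offset = buf_reserve(b, n);
   if (offset < 0)
      return false;
   memcpy(b->data + offset, src, n);
   return true;
}

/* Fill in a placeholder. The range check is written so that
 * offset + n cannot overflow. */
static bool buf_overwrite(struct buf *b, size_t offset,
                          const void *src, size_t n)
{
   if (offset > b->size || n > b->size - offset)
      return false;
   memcpy(b->data + offset, src, n);
   return true;
}
```

A typical use is to reserve four bytes for a count, append a variable number of items, and then overwrite the placeholder once the count is known.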
* compiler/blob: Allow for fixed-size blobs with a NULL data pointerJason Ekstrand2017-10-122-3/+10
| | | | | | | | These can be used to easily count up the number of bytes that will be required by "writing" it into the NULL blob. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
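The counting trick described above can be sketched as a two-pass serialization: run the serializer once against a writer with no storage to learn the size, then run it again against a buffer of exactly that size. The `struct fixed_writer` type and its functions are hypothetical stand-ins for this illustration, not the blob API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-capacity writer; it never allocates. */
struct fixed_writer {
   uint8_t *data;    /* may be NULL: writes then only advance size */
   size_t size;      /* bytes "written" so far */
   size_t capacity;  /* hard limit */
   bool overrun;     /* sticky: once set, every later write fails */
};

static void fixed_writer_init(struct fixed_writer *w, void *data,
                              size_t capacity)
{
   w->data = data;
   w->size = 0;
   w->capacity = capacity;
   w->overrun = false;
}

static bool fixed_writer_write(struct fixed_writer *w, const void *src,
                               size_t n)
{
   if (w->overrun || n > w->capacity - w->size) {
      w->overrun = true;
      return false;
   }
   if (w->data != NULL)
      memcpy(w->data + w->size, src, n);
   w->size += n;
   return true;
}

/* Example payload: a 32-bit count followed by that many bytes. The same
 * function serves both the counting pass and the real pass. */
static bool serialize_record(struct fixed_writer *w, const uint8_t *bytes,
                             uint32_t count)
{
   return fixed_writer_write(w, &count, sizeof(count)) &&
          fixed_writer_write(w, bytes, count);
}
```

Because the same serializer runs in both passes, the computed size cannot drift out of sync with the bytes actually written, which is the hazard of hand-rolled size calculations.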
* compiler/blob: Add a concept of a fixed-allocation blobJason Ekstrand2017-10-122-1/+37
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* compiler/blob: Switch to init/finish instead of create/destroyJason Ekstrand2017-10-125-87/+80
| | | | | | | | | There's no reason why that tiny bit of memory needs to be on the heap. We always put blob_reader on the stack, so why not do the same with the writable blob. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* compiler: Move blob up a levelJason Ekstrand2017-10-126-6/+6
| | | | | | | | We're going to want to use the blob for Vulkan pipeline caching so it makes sense to have it in libcompiler not libglsl. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* meson: Add inc_compiler to the libglsl includesJason Ekstrand2017-10-121-1/+1
|
* glsl/blob: Return false from grow_to_fit if we've ever failedJason Ekstrand2017-10-122-1/+13
| | | | | | | | | | | | Otherwise we could have a failure followed by a smaller write that succeeds and get a corrupted blob. If we ever OOM, we should stop. v2 (Jason Ekstrand): - Initialize the new boolean member in create_blob Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Cc: [email protected]
* glsl/blob: Return false from ensure_can_read on overrunJason Ekstrand2017-10-121-0/+3
| | | | | | | | | Otherwise, if you have a large read fail and then try to do a small read, the small read may succeed even though it's at the wrong offset. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Cc: [email protected]
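The failure mode described above, and the sticky-flag remedy, can be sketched with a small reader. The `struct reader` type and its functions are hypothetical names used for illustration and are not the blob_reader implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read cursor over a byte buffer. */
struct reader {
   const uint8_t *data;
   size_t size;
   size_t pos;
   bool overrun;  /* sticky: set by the first read that does not fit */
};

/* Returns false if this read does not fit, or if any earlier read failed.
 * Without the early return, a small read after a failed large read could
 * succeed at an offset that no longer matches the writer's layout. */
static bool reader_ensure_can_read(struct reader *r, size_t n)
{
   if (r->overrun)
      return false;
   if (n <= r->size - r->pos)
      return true;
   r->overrun = true;
   return false;
}

/* Reads return 0 on failure and leave the cursor where it was. */
static uint32_t reader_read_u32(struct reader *r)
{
   uint32_t value = 0;
   if (reader_ensure_can_read(r, sizeof(value))) {
      memcpy(&value, r->data + r->pos, sizeof(value));
      r->pos += sizeof(value);
   }
   return value;
}

static uint8_t reader_read_u8(struct reader *r)
{
   uint8_t value = 0;
   if (reader_ensure_can_read(r, sizeof(value))) {
      value = r->data[r->pos];
      r->pos += sizeof(value);
   }
   return value;
}
```

With a three-byte buffer, a 32-bit read fails and sets the flag; a following one-byte read then also fails, even though a byte is physically available, so the caller only has to check `overrun` once at the end.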
* i965: Share the flush for brw_blorp_miptree_download into a pboChris Wilson2017-10-123-31/+24
| | | | | | | | | | | As all users of brw_blorp_miptree_download() must emit a full pipeline and cache flush when targeting a user PBO (as that PBO may then be subsequently bound or *be* bound anywhere and outside of the driver dirty tracking) move that flush into brw_blorp_miptree_download() itself. v2 (Ken): Rebase without userptr stuff so it can land sooner. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Delete the PBO texture upload/download pathJason Ekstrand2017-10-124-97/+0
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use blorp instead of meta for PBO pixel readsJason Ekstrand2017-10-121-9/+51
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use blorp instead of meta for PBO texture downloadsJason Ekstrand2017-10-121-4/+29
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/tex: Use blorp texture upload for all CCS_E texturesJason Ekstrand2017-10-121-1/+2
| | | | | | | | | | | This improves the FillTex benchmark in GLBench 2.7 by 30% on my Broxton. On Ken's Broxton which only has single-channel ram, it improves by 210%. v2 (Ken): Check mt->aux_usage == ISL_AUX_USAGE_CCS_E rather than using intel_miptree_is_lossless_compressed(). Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use blorp instead of meta for PBO texture uploadsJason Ekstrand2017-10-121-4/+30
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add blorp-based texture upload and download pathsJason Ekstrand2017-10-122-0/+362
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v1 (Topi Pohjolainen): original patch. v2 (Topi Pohjolainen): - Fix return value (s/MESA_FORMAT_NONE/false/) (Anuj) - Move _mesa_tex_format_from_format_and_type() just in the end avoiding additional if-block (Anuj) - Explain better the array alignment restriction (Anuj) - Do not bail out in case of gl_pixelstore_attrib::ImageHeight, it is handled by _mesa_image_offset() automatically (Ken). - Support 1D_ARRAY by flipping depth, width and y, z (Ken). v3 (Topi Pohjolainen): - Contrary to v2, do not try to handle gl_pixelstore_attrib::ImageHeight. Currently there are no tests in piglit or cts for it. One could possibly copy or modify tests/texturing/texsubimage.c. There, however, seems to be number of corner cases to consider. Moreover, current meta path applies the packing height for both source and targets when determining the offset. This would probably require re-visiting also. v4 (Topi Pohjolainen): Rebased on top of merged drm-bacon v5 (Jason Ekstrand): - Move to brw_blorp.c - Significant refactoring - Fixed 1-D array textures - Simplified handling of PBOs vs. CPU data. - Handle gl_pixelstore_attrib::ImageHeight. It turns out there are piglit tests that cover this. The original version was failing them because of an error in the way it handled 1-D array textures. - Add support for texture download v6 (Kenneth Graunke): Rebase fixes: - Use intel_miptree_check_level_layer instead of deleted fields - Update for mesa_format_supports_render[] rename. - Pass 'false' (read-only) to intel_bufferobj_buffer v7 (Kenneth Graunke): - Fix brw_blorp_download_miptree to pass 'false' (not read only) for the destination buffer (caught by Chris Wilson). - Fix blorp_get_client_bo to pass intel_bufferobj_buffer !read_only for the 'writable' parameter instead of 'false' (caught by Jason). 
- Support GL_BGR, GL_BGRA, GL_BGRA_INTEGER, GL_BGR_INTEGER, allowing us to use this for ReadPixels on the window system buffer (caught by Chris Wilson). - Fix y-flipping bugs in download path (exposed by BGRA support). - Fix false vs. NULL return value in blorp_get_client_bo. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Refactor y-flipping coordinate transform.Kenneth Graunke2017-10-121-7/+11
| | | | | | I want to reuse it for the BLORP download path. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/tex: Check if there is data to upload up-frontJason Ekstrand2017-10-121-0/+4
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/barrier: Do the correct flushes for framebuffer accessJason Ekstrand2017-10-121-1/+1
| | | | | | | Framebuffer access includes framebuffer reads so we need to invalidate the texture cache. We do not, however, need to flush the depth cache because you cannot bind a depth texture as an image. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/barrier: Do the correct flushes for texture updatesJason Ekstrand2017-10-121-2/+4
| | | | | | | | | Texture uploads and downloads may go through the render pipe which may result in texturing from or rendering to the texture or the PBO. We need to flush accordingly. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* include: Revert out the update of the Khronos GLX extension header.Eric Anholt2017-10-121-11/+1
| | | | | | | | They made a mistake in the MESA_swap_control XML, which I'm pursuing in their github. Until then, we can just back this piece out. Tested-by: Mark Janes <[email protected]> Reviewed-by: Mark Janes <[email protected]>
* i965: Ignore GL_SKIP_DECODE_EXT for textures accessed via texelFetch().Kenneth Graunke2017-10-121-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GL_EXT_texture_sRGB_decode spec says: "The conversion of sRGB color space components to linear color space is always performed if the texel lookup function is one of the texelFetch builtin functions. Otherwise, if the texel lookup function is one of the texture builtin functions or one of the texture gather functions, the conversion of sRGB color space components to linear color space is controlled by the TEXTURE_SRGB_DECODE_EXT parameter. If the TEXTURE_SRGB_DECODE_EXT parameter is DECODE_EXT, the conversion of sRGB color space components to linear color space is performed. If the TEXTURE_SRGB_DECODE_EXT parameter is SKIP_DECODE_EXT, the value is returned without decoding. However, if the texture is also accessed with a texelFetch function, then the result of texture builtin functions and/or texture gather functions may be returned with decoding or without decoding." This patch makes i965 force sRGB decoding for any textures accessed via texelFetch(). If textures are accessed via texelFetch() and a regular texture access function, this will affect the other ones too - which is fine - it's undefined according to the last paragraph quoted. We could make both work, but we'd have to emit multiple SURFACE_STATEs, and have two binding table sections, like we do for texture gather hacks on older platforms. Fixes the following Android O CTS test: dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch Reviewed-by: Jason Ekstrand <[email protected]>
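The decision this patch describes reduces to a small predicate. The sketch below uses hypothetical names (`enum srgb_decode_mode`, `should_decode_srgb`) standing in for the GL enums and the driver's surface-state logic; it illustrates the rule only and is not the i965 code.

```c
#include <stdbool.h>

/* Stand-ins for the DECODE_EXT / SKIP_DECODE_EXT values of the
 * TEXTURE_SRGB_DECODE_EXT parameter. */
enum srgb_decode_mode {
   SRGB_DECODE,
   SRGB_SKIP_DECODE,
};

/* Decide whether the texture's surface should use an sRGB-decoding format.
 * accessed_via_texel_fetch is true if any texelFetch in the shader reads
 * this texture. */
static bool should_decode_srgb(enum srgb_decode_mode mode,
                               bool accessed_via_texel_fetch)
{
   /* texelFetch always decodes, so it overrides the skip request. Other
    * lookups of the same texture then decode too, which the quoted spec
    * text permits. */
   if (accessed_via_texel_fetch)
      return true;

   return mode == SRGB_DECODE;
}
```

Using one decision per texture keeps a single surface state per binding; honoring both behaviors at once would need the duplicated surface states the message mentions.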