AMD Patent Applications

REDUNDANCY METHOD AND APPARATUS FOR SHADER COLUMN REPAIR

Granted: October 27, 2022
Application Number: 20220343456
Methods and systems are described. A system includes a redundant shader pipe array that performs rendering calculations on data provided thereto and a shader pipe array that includes a plurality of shader pipes, each of which performs rendering calculations on data provided thereto. The system also includes a circuit that identifies a defective shader pipe of the plurality of shader pipes in the shader pipe array. In response to identifying the defective shader pipe, the circuit…

KERNEL SIZE INDEPENDENT POOLING OPERATIONS

Granted: October 13, 2022
Application Number: 20220327353
Devices, methods, and systems for determining N-dimensional MaxPool or AvgPool for an M-dimensional input array. For each of N dimensions, in order from highest to lowest dimension i: the M-dimensional input array is decomposed into 1-dimensional (1D) input arrays in the ith dimension, 1D MaxPool or AvgPool is performed on each of the 1D input arrays in the ith dimension to generate 1D output arrays in the ith dimension, and the M-dimensional input array is recomposed from the 1D output…
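
The decomposition described in this abstract can be illustrated with a small sketch for the 2D case. This is not the claimed implementation: the function names, the list-of-lists layout, the absence of padding, and the order in which the two axes are processed are all assumptions made for illustration. It relies only on the general fact that max pooling, and average pooling over equal-sized unpadded windows, can be computed one dimension at a time.

```python
def pool_1d(values, k, stride, op):
    """Apply op (e.g. max or mean) to each length-k window of a 1D list."""
    return [op(values[i:i + k]) for i in range(0, len(values) - k + 1, stride)]


def mean(xs):
    return sum(xs) / len(xs)


def pool_2d_separable(grid, k, stride, op):
    """2D pooling built from two passes of 1D pooling (hypothetical helper)."""
    # Pass 1: treat every row as a 1D array and pool along it.
    pooled_rows = [pool_1d(row, k, stride, op) for row in grid]
    # Pass 2: treat every column of the intermediate result as a 1D array.
    columns = [list(col) for col in zip(*pooled_rows)]
    pooled_cols = [pool_1d(col, k, stride, op) for col in columns]
    # Recompose: transpose back so the result is row-major again.
    return [list(row) for row in zip(*pooled_cols)]


def pool_2d_direct(grid, k, stride, op):
    """Reference: pool each k-by-k window directly."""
    out = []
    for r in range(0, len(grid) - k + 1, stride):
        out_row = []
        for c in range(0, len(grid[0]) - k + 1, stride):
            window = [grid[r + dr][c + dc] for dr in range(k) for dc in range(k)]
            out_row.append(op(window))
        out.append(out_row)
    return out


grid = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
print(pool_2d_separable(grid, 2, 2, max))   # [[6, 8], [9, 6]]
print(pool_2d_separable(grid, 2, 2, mean))  # [[3.75, 5.25], [4.5, 3.0]]
```

Because each pass only needs a 1D pooling routine, the same routine can serve any window size, which is one plausible reading of "kernel size independent" in the title.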

DRAM COMMAND STREAK EFFICIENCY MANAGEMENT

Granted: October 6, 2022
Application Number: 20220317928
A memory controller includes a command queue and an arbiter for selecting entries from the command queue for transmission to a DRAM. The arbiter transacts streaks of consecutive read commands and streaks of consecutive write commands. The arbiter transacts a streak for at least a minimum burst length based on a number of commands of a designated type available to be selected by the arbiter. Following the minimum burst length, the arbiter decides to start a new streak of commands of a…

POST-DEPTH VISIBILITY COLLECTION WITH TWO LEVEL BINNING

Granted: October 6, 2022
Application Number: 20220319091
A method and apparatus of tile rendering of an image for a display in a computer system includes receiving the image in a graphics pipeline of the computer system, the image comprising one or more three-dimensional (3D) objects. The image is divided into one or more tiles. A depth test is performed on the one or more tiles, and based upon the depth test, visibility information of the one or more tiles is binned.
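
A rough sketch of the flow in this abstract (divide into tiles, depth-test, then bin visibility) might look like the following. It is an illustration under stated assumptions, not the patented method: the fragment tuple layout, the function name `bin_visible_primitives`, the "smaller depth is nearer" convention, and the single level of square tiles are all hypothetical, and the two-level binning named in the title is not modeled.

```python
def bin_visible_primitives(fragments, width, height, tile_size):
    """Return {tile: set of primitive ids that survive the depth test in it}.

    fragments: iterable of (x, y, depth, primitive_id) tuples (assumed layout).
    """
    nearest = {}  # (x, y) -> (depth, primitive_id) of the closest fragment so far
    for x, y, depth, prim in fragments:
        if not (0 <= x < width and 0 <= y < height):
            continue  # outside the image
        # Depth test: keep the fragment only if it is nearer than what is stored.
        if (x, y) not in nearest or depth < nearest[(x, y)][0]:
            nearest[(x, y)] = (depth, prim)

    # Binning happens after the depth test, so occluded primitives never
    # appear in a tile's visibility set.
    bins = {}
    for (x, y), (_, prim) in nearest.items():
        tile = (x // tile_size, y // tile_size)
        bins.setdefault(tile, set()).add(prim)
    return bins


fragments = [
    (0, 0, 0.5, "A"),
    (0, 0, 0.2, "B"),  # nearer than A at this pixel
    (0, 0, 0.7, "C"),  # behind both, so never visible
    (1, 1, 0.3, "A"),
    (5, 1, 0.9, "A"),
]
bins = bin_visible_primitives(fragments, width=8, height=4, tile_size=4)
print(bins)
```

In this example, primitive "C" is absent from every bin because its only fragment fails the depth test.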

MULTI-ACCELERATOR COMPUTE DISPATCH

Granted: October 6, 2022
Application Number: 20220319089
Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such…
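
The completion protocol outlined here can be mocked up as a sequential simulation. Everything below is an assumption for illustration: the round-robin assignment policy, the `dispatch` and `run_workgroup` names, the event tuples, and the choice that the chiplet which observes the last completion is the one that notifies the client. The abstract does not state these details.

```python
def dispatch(num_workgroups, num_chiplets, run_workgroup):
    """Simulate one kernel dispatch packet spread over several chiplets."""
    # Assign workgroups to chiplets (round-robin is an assumed policy).
    assignments = {c: [] for c in range(num_chiplets)}
    for wg in range(num_workgroups):
        assignments[wg % num_chiplets].append(wg)

    # Each chiplet's local view of which chiplets have finished their share.
    done_seen = {c: set() for c in range(num_chiplets)}
    events = []

    # Sequential stand-in for what would be parallel execution in hardware.
    for c in range(num_chiplets):
        for wg in assignments[c]:
            run_workgroup(c, wg)
        # Chiplet c has completed all of its workgroups: tell every chiplet.
        for other in range(num_chiplets):
            done_seen[other].add(c)
        events.append(("chiplet_done", c))
        # Once a chiplet has seen every completion, the whole packet is done.
        if len(done_seen[c]) == num_chiplets:
            events.append(("client_notified", c))
    return assignments, events


executed = []
assignments, events = dispatch(5, 2, lambda c, wg: executed.append((c, wg)))
print(assignments)
print(events)
```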

REAL TIME MACHINE LEARNING-BASED PRIVACY FILTER FOR REMOVING REFLECTIVE FEATURES FROM IMAGES AND VIDEO

Granted: October 6, 2022
Application Number: 20220318954
A method for removing reflections from images is disclosed. The method includes identifying one or more segments of an image, the one or more segments including a reflection; identifying one or more features of the one or more segments; removing the one or more features from the segments to generate one or more sanitized segments; and combining the one or more sanitized segments with the image to generate a sanitized image.

MEMORY CONTROLLER POWER STATES

Granted: October 6, 2022
Application Number: 20220318161
A memory controller includes a command queue and an arbiter operating in a first voltage domain, and a physical layer interface (PHY) operating in a second voltage domain. The memory controller includes isolation cells operable to isolate the PHY from the first voltage domain. A local power state controller, in response to a first power state command, provides configuration and state data for storage in an on-chip RAM memory, causes the memory controller to enter a powered-down state,…

METHOD AND APPARATUS FOR A DRAM CACHE TAG PREFETCHER

Granted: October 6, 2022
Application Number: 20220318151
Devices and methods for cache prefetching are provided. A device is provided which comprises memory and a processor. The memory comprises a DRAM cache, a cache dedicated to the processor, and one or more intermediate caches between the dedicated cache and the DRAM cache. The processor is configured to issue prefetch requests to prefetch data, issue data access requests to fetch the data, and, when one or more previously issued prefetch requests are determined to be inaccurate, issue a…

WAVEFRONT SELECTION AND EXECUTION

Granted: October 6, 2022
Application Number: 20220318021
Techniques are provided for executing wavefronts. The techniques include at a first time for issuing instructions for execution, performing first identifying, including identifying that sufficient processing resources exist to execute a first set of instructions together within a processing lane; in response to the first identifying, executing the first set of instructions together; at a second time for issuing instructions for execution, performing second identifying, including…

EFFICIENT AND LOW LATENCY MEMORY ACCESS SCHEDULING

Granted: October 6, 2022
Application Number: 20220317934
A memory controller includes a command queue that receives and stores decoded memory commands and information related thereto, including information indicating a type, a priority, an age, and a region of a memory system for a corresponding decoded memory command, and an arbiter that is coupled to the command queue and picks selected decoded memory commands among the decoded memory commands from the command queue for dispatch to the memory system by comparing the priority and the age for decoded…

ADAPTIVE MEMORY CONSISTENCY IN DISAGGREGATED DATACENTERS

Granted: October 6, 2022
Application Number: 20220317927
A data processor includes a fabric-attached memory (FAM) interface for coupling to a data fabric and fulfilling memory access instructions. A requestor-side adaptive consistency controller coupled to the FAM interface requests notifications from a fabric manager for the fabric-attached memory regarding changes in requestors authorized to access a FAM region which the data processor is authorized to access. If a notification indicates that more than one requestor is authorized to access…

EFFICIENT AND LOW LATENCY MEMORY ACCESS SCHEDULING

Granted: October 6, 2022
Application Number: 20220317924
A memory controller includes a command queue that receives and stores decoded memory commands and information related thereto, including information indicating a type, a priority, an age, and a region of a memory system for a corresponding decoded memory command, and an arbiter that is coupled to the command queue and picks selected decoded memory commands among the decoded memory commands from the command queue for dispatch to the memory system by comparing the priority and the age for decoded…

WRITE BANK GROUP MASK DURING ARBITRATION

Granted: October 6, 2022
Application Number: 20220317923
A memory controller includes an arbiter for selecting memory requests from a command queue for transmission to a DRAM memory. The arbiter includes a bank group tracking circuit that tracks bank group numbers of three or more prior write requests selected by the arbiter. The arbiter also includes a selection circuit that selects requests to be issued from the command queue, and prevents selection of write requests and associated activate commands to the tracked bank group numbers unless…

DATA FABRIC CLOCK SWITCHING

Granted: October 6, 2022
Application Number: 20220317755
A memory controller couples to a data fabric clock domain and to a physical layer interface circuit (PHY) clock domain. A first interface circuit adapts transfers between the data fabric clock domain (FCLK) and the memory controller's clock domain, and a second interface circuit couples the memory controller to the PHY clock domain. A power controller responds to a power state change request by sending commands to the second interface circuit to change parameters of a memory system and to…

SYNCHRONIZATION FREE CROSS PASS BINNING THROUGH SUBPASS INTERLEAVING

Granted: September 29, 2022
Application Number: 20220309729
A method of tiled rendering is provided which comprises dividing a frame to be rendered into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles, and interleaving execution of same subpasses of multiple tiles of the frame. Interleaving execution of same subpasses of multiple tiles comprises executing a previously ordered first subpass of a second tile between execution of the previously ordered first subpass of a first tile and execution of a…

DYNAMICALLY RECONFIGURABLE REGISTER FILE

Granted: September 29, 2022
Application Number: 20220309606
Techniques for managing register allocation are provided. The techniques include detecting a first request to allocate first registers for a first wavefront; first determining, based on allocation information, that allocating the first registers to the first wavefront would result in a condition in which a deadlock is possible; in response to the first determining, refraining from allocating the first registers to the first wavefront; detecting a second request to allocate second…

APPROXIMATION OF MATRICES FOR MATRIX MULTIPLY OPERATIONS

Granted: September 29, 2022
Application Number: 20220309126
A processing device is provided which comprises memory configured to store data and a processor configured to receive a portion of data of a first matrix comprising a first plurality of elements and receive a portion of data of a second matrix comprising a second plurality of elements. The processor is also configured to determine values for a third matrix by dropping a number of products from products of pairs of elements of the first and second matrices based on approximating the…

DATA COMPRESSOR FOR APPROXIMATION OF MATRICES FOR MATRIX MULTIPLY OPERATIONS

Granted: September 29, 2022
Application Number: 20220309125
A processing device is provided which comprises memory configured to store data and a processor. The processor comprises a plurality of MACs configured to perform matrix multiplication of elements of a first matrix and elements of a second matrix. The processor also comprises a plurality of logic devices configured to sum values of bits of product exponents values of the elements of the first matrix and second matrix and determine keep bit values for product exponents values to be kept…

OVERLAPPED CURVE MAPPING FOR HISTOGRAM-BASED LOCAL TONE AND LOCAL CONTRAST

Granted: September 15, 2022
Application Number: 20220292653
Methods and apparatuses are disclosed herein for performing tone mapping and/or contrast enhancement. In some examples, a block mapping curve is low-pass filtered with block mapping curves of surrounding blocks to form a smoothed block mapping curve. In some examples, overlapped curve mapping of block mapping curves, including smoothed block mapping curves, is performed, including weighting, based on a pixel location, block mapping curves of a group of blocks to generate an interpolated…

DATA CACHE REGION PREFETCHER

Granted: September 8, 2022
Application Number: 20220283955
A method, system, and processing system for prefetching data are disclosed. The method, system, and processing system include data cache region prefetch circuitry for detecting a first access by a first instruction at a first instruction address to a first memory portion, detecting a first non-sequential access pattern to a set of addresses in the first memory portion, and in response to a miss by a second instruction at the first instruction address, and in response to the…