AMD Patent Grants

Graded throttling for network-on-chip traffic

Granted: January 16, 2024
Patent Number: 11876718
Graded throttling for network-on-chip traffic, including: calculating, by an agent of a network-on-chip, a number of outstanding transactions issued by the agent; determining that the number of outstanding transactions meets a threshold; and implementing, by the agent, in response to the number of outstanding transactions meeting the threshold, a traffic throttling policy.

Variable tick for DRAM interface calibration

Granted: January 16, 2024
Patent Number: 11875875
Methods and systems are disclosed for calibrating, by a memory interface system, an interface with dynamic random-access memory (DRAM) using a dynamically changing training clock. Techniques disclosed comprise receiving a system clock having a clock signal at a first pulse rate. Then, during the training of the interface, techniques disclosed comprise generating a training clock from the clock signal at the first pulse rate, the training clock having a clock signal at a second pulse…

Implementing heterogeneous wavefronts on a graphics processing unit (GPU)

Granted: January 16, 2024
Patent Number: 11875425
Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on…

Coherent block read fulfillment

Granted: January 16, 2024
Patent Number: 11874783
A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. The plurality of coherent master controllers each include a response data buffer. The coherent slave controller is coupled to the plurality of coherent master controllers. The coherent slave controller, responsive to determining a selected coherent block read command is guaranteed to have only one data response, sends a target request globally ordered message to the selected…

Processor-guided execution of offloaded instructions using fixed function operations

Granted: January 9, 2024
Patent Number: 11868777
Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request…

Stacked die circuit routing system and method

Granted: January 9, 2024
Patent Number: 11869874
A stacked die system includes at least three dies. A first die has a same design as a second die. The first die includes a first circuit, and the second die includes a corresponding second circuit. A signal is received at the first die and sent to the third die via the second die. The signal is routed through either the first circuit or the second circuit but not both. Accordingly, an operation is performed on the signal prior to the signal reaching the third die but the operation is not…

Combined world-space pipeline shader stages

Granted: January 9, 2024
Patent Number: 11869140
Improvements to graphics processing pipelines are disclosed. More specifically, the vertex shader stage, which performs vertex transformations, and the hull or geometry shader stages, are combined. If tessellation is disabled and geometry shading is enabled, then the graphics processing pipeline includes a combined vertex and graphics shader stage. If tessellation is enabled, then the graphics processing pipeline includes a combined vertex and hull shader stage. If tessellation and…

Lock address contention predictor

Granted: January 9, 2024
Patent Number: 11868818
Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction. including methods an devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock…

Hardware assisted fine-grained data movement

Granted: January 9, 2024
Patent Number: 11868809
A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such…

Compacted addressing for transaction layer packets

Granted: January 9, 2024
Patent Number: 11868778
Compacted addressing for transaction layer packets, including: determining, for a first epoch, one or more low entropy address bits in a plurality of first transaction layer packets; removing, from one or more memory addresses of one or more second transaction layer packets, the one or more low entropy address bits; and sending the one or more second transaction layer packets.

Shader source code performance prediction

Granted: January 9, 2024
Patent Number: 11868759
Shader source code performance prediction is described. In accordance with the described techniques, an update to shader source code for implementing a shader is received. A prediction of performance of the shader on a processing unit is generated based on the update to the shader source code. Feedback about the update is output. The feedback includes the prediction of performance of the shader. In one or more implementations, generating the prediction of performance of the shader…

Processing-in-memory concurrent processing system and method

Granted: January 9, 2024
Patent Number: 11868306
A processing system includes a processing unit and a memory device. The memory device includes a processing-in-memory (PIM) module that performs processing operations on behalf of the processing unit. An instruction set architecture (ISA) of the PIM module has fewer instructions than an ISA of the processing unit. Instructions received from the processing unit are translated such that processing resources of the PIM module are virtualized. As a result, the PIM module concurrently…

Using epoch counter values for controlling the retention of cache blocks in a cache

Granted: January 9, 2024
Patent Number: 11868254
An electronic device includes a cache, a memory, and a controller. The controller stores an epoch counter value in metadata for a location in the memory when a cache block evicted from the cache is stored in the location. The controller also controls how the cache block is retained in the cache based at least in part on the epoch counter value when the cache block is subsequently retrieved from the location and stored in the cache.

Multi-adaptive cache replacement policy

Granted: January 9, 2024
Patent Number: 11868221
Techniques for performing cache operations are provided. The techniques include tracking performance events for a plurality of test sets of a cache, detecting a replacement policy change trigger event associated with a test set of the plurality of test sets, and in response to the replacement policy change trigger event, operating non-test sets of the cache according to a replacement policy associated with the test set.

Automatic in-game subtitles and closed captions

Granted: January 2, 2024
Patent Number: 11857877
An approach is provided for a gaming overlay application to provide automatic in-game subtitles and/or closed captions for video game applications. The overlay application accesses an audio stream and a video stream generated by an executing game application. The overlay application processes the audio stream through a text conversion engine to generate at least one subtitle. The overlay application determines a display position to associate with the at least one subtitle. The overlay…

Peripheral device protocols in confidential compute architectures

Granted: January 2, 2024
Patent Number: 11860797
Restricting peripheral device protocols in confidential compute architectures, the method including: receiving a first address translation request from a peripheral device supporting a first protocol, wherein the first protocol supports cache coherency between the peripheral device and a processor cache; determining that a confidential compute architecture is enabled; and providing, in response to the first address translation request, a response including an indication to the peripheral…

Cache miss predictor

Granted: January 2, 2024
Patent Number: 11860787
Methods, devices, and systems for retrieving information based on cache miss prediction. A prediction that a cache lookup for the information will miss a cache is made based on a history table. The cache lookup for the information is performed based on the request. A main memory fetch for the information is begun before the cache lookup completes, based on the prediction that the cache lookup for the information will miss the cache. In some implementations, the prediction includes…

Live profile-driven cache aging policies

Granted: January 2, 2024
Patent Number: 11860784
A technique for operating a cache is disclosed. The technique includes recording access data for a first set of memory accesses of a first frame; identifying parameters for a second set of memory accesses of a second frame subsequent to the first frame, based on the access data; and applying the parameters to the second set of memory accesses.

Hardware assisted memory profiling aggregator

Granted: January 2, 2024
Patent Number: 11860755
An approach is provided for implementing memory profiling aggregation. A hardware aggregator provides memory profiling aggregation by controlling the execution of a plurality of hardware profilers that monitor memory performance in a system. For each hardware profiler of the plurality of hardware profilers, a hardware counter value is compared to a threshold value. When a threshold value is satisfied, execution of a respective hardware profiler of the plurality of hardware profilers is…

Clock frequency divider circuit

Granted: January 2, 2024
Patent Number: 11860685
A system and method for efficiently generating clock signals are described. In various implementations, an integrated circuit includes multiple clock frequency dividers both at its I/O boundaries and across its die. A clock frequency divider utilizes a first clock divider and a second clock divider that receive input clock signals with an initial phase difference between them. The first clock divider and the second clock divider generate output clock signals that have frequencies that…