AMD Patent Grants

Skew matching in a die-to-die interface

Granted: April 16, 2024
Patent Number: 11960435
A semiconductor package for skew matching in a die-to-die interface, including: a first die; a second die aligned with the first die such that each connection point of a first plurality of connection points of the first die is substantially equidistant to a corresponding connection point of a second plurality of connection points of the second die; and a plurality of connection paths of a substantially same length, wherein each connection path of the plurality of connection paths couples…

Method and apparatus for reducing the latency of long latency memory requests

Granted: April 16, 2024
Patent Number: 11960404
Systems, apparatuses, and methods for efficiently processing memory requests are disclosed. A computing system includes at least one processing unit coupled to a memory. Circuitry in the processing unit determines that a memory request has become a long-latency request based on detecting that a translation lookaside buffer (TLB) miss, a branch misprediction, a memory dependence misprediction, or a precise exception has occurred. The circuitry marks the memory request as a long-latency request such as…

Relaxed invalidation for cache coherence

Granted: April 16, 2024
Patent Number: 11960399
Methods, systems, and devices maintain state information in a shadow tag memory for a plurality of cachelines in each of a plurality of private caches, with each of the private caches being associated with a corresponding one of multiple processing cores. One or more cache probes are generated based on a write operation associated with one or more cachelines of the plurality of cachelines, such that each of the cache probes is associated with cachelines of a particular private cache of…

Performance management during power supply voltage droop

Granted: April 16, 2024
Patent Number: 11960340
A method for controlling a data processing system includes detecting a droop in a power supply voltage of a functional circuit of the data processing system greater than a programmable droop threshold. An operation of the data processing system is throttled according to a programmable step size, a programmable assertion time, and a programmable de-assertion time in response to detecting the droop.
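
The throttling behavior described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented circuit: the class name `DroopThrottle`, the millivolt units, the 0 to 100 throttle scale, and the reading of "assertion time" and "de-assertion time" as the number of cycles between upward and downward throttle steps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DroopThrottle:
    """Toy model of droop-triggered throttling (all names are hypothetical)."""

    nominal_mv: int          # expected supply voltage
    droop_threshold_mv: int  # programmable droop threshold
    step: int                # programmable throttle step size
    assert_cycles: int       # assumed meaning: cycles between throttle increases
    deassert_cycles: int     # assumed meaning: cycles between throttle decreases
    max_level: int = 100     # assumed upper bound of the throttle scale
    level: int = 0           # current throttle level, 0 means unthrottled
    _drooping: bool = False
    _timer: int = 0

    def tick(self, supply_mv: int) -> int:
        """Advance one cycle with the sampled supply voltage; return the level."""
        drooping = (self.nominal_mv - supply_mv) > self.droop_threshold_mv
        if drooping != self._drooping:
            # Restart the timer whenever the droop condition changes.
            self._drooping = drooping
            self._timer = 0
        self._timer += 1
        period = self.assert_cycles if drooping else self.deassert_cycles
        if self._timer >= period:
            self._timer = 0
            delta = self.step if drooping else -self.step
            self.level = max(0, min(self.max_level, self.level + delta))
        return self.level


ctl = DroopThrottle(nominal_mv=1000, droop_threshold_mv=50, step=10,
                    assert_cycles=1, deassert_cycles=2)
samples = [1000, 940, 940, 1000, 1000, 1000, 1000]
levels = [ctl.tick(v) for v in samples]
print(levels)  # [0, 10, 20, 20, 10, 10, 0]
```

With these settings the throttle ramps up one step per cycle while the droop exceeds 50 mV and releases one step every two cycles after the supply recovers, so the three parameters shape attack and release independently.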

Multi-die stacked power delivery

Granted: April 16, 2024
Patent Number: 11960339
A multi-die processor semiconductor package includes a first base integrated circuit (IC) die configured to provide, based at least in part on an indication of a configuration of a first plurality of compute dies 3D stacked on top of the first base IC die, a unique power domain to each of the first plurality of compute dies. In some embodiments, the semiconductor package also includes a second base IC die including a second plurality of compute dies 3D stacked on top of the second base…

Hybrid render with preferred primitive batch binning and sorting

Granted: April 9, 2024
Patent Number: 11954782
A method, system, and non-transitory computer readable storage medium for rasterizing primitives are disclosed. The method, system, and non-transitory computer readable storage medium include: generating a primitive batch from a sequence of one or more primitives, wherein the primitive batch includes primitives sorted into one or more row groups based on which row of a plurality of rows each primitive intersects; and processing each row group, the processing for each row group…

Enhanced method for a useful blockchain consensus

Granted: April 9, 2024
Patent Number: 11956368
An approach is provided for implementing a useful proof-of-work consensus algorithm. A proposed block is received. A combined hash value is generated based on the proposed block and a nonce value. The combined hash value is divided into a plurality of hash value pieces that each correspond to a work packet of a plurality of work packets. One or more requests are transmitted for the plurality of work packets that correspond to the plurality of hash value pieces. In response to receiving…
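
The first steps of this abstract (hashing the proposed block with a nonce, then dividing the result into pieces that each select a work packet) can be sketched as follows. This is an illustration only: the choice of SHA-256, the 8-byte nonce encoding, equal-size pieces, and the modulo mapping from a piece to a packet index are assumptions, and the function names are hypothetical. The steps elided from the abstract are not modeled.

```python
import hashlib
from typing import List


def combined_hash(block: bytes, nonce: int) -> bytes:
    """Hash the proposed block together with the nonce (SHA-256 is an assumption)."""
    return hashlib.sha256(block + nonce.to_bytes(8, "big")).digest()


def split_hash(digest: bytes, pieces: int) -> List[bytes]:
    """Divide the digest into equal-size pieces (equal sizing is an assumption)."""
    if pieces <= 0 or len(digest) % pieces != 0:
        raise ValueError("digest length must be divisible by the piece count")
    size = len(digest) // pieces
    return [digest[i:i + size] for i in range(0, len(digest), size)]


def packet_indices(hash_pieces: List[bytes], num_packets: int) -> List[int]:
    """Map each piece to a work packet index (modulo mapping is an assumption)."""
    return [int.from_bytes(piece, "big") % num_packets for piece in hash_pieces]


digest = combined_hash(b"proposed block", nonce=42)
hash_pieces = split_hash(digest, pieces=4)
indices = packet_indices(hash_pieces, num_packets=1000)
print(len(hash_pieces), [len(p) for p in hash_pieces])  # 4 [8, 8, 8, 8]
```

Because the pieces are derived from the hash, changing the nonce changes which work packets are requested, which is what ties the useful work to a particular block proposal in this sketch.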

Semiconductor chip having stepped conductive pillars

Granted: April 9, 2024
Patent Number: 11955447
In an implementation, a semiconductor chip includes a device layer, an interconnect layer fabricated on the device layer, the interconnect layer including a conductive pad, and a conductive pillar coupled to the conductive pad. The conductive pillar includes at least a first portion having a first width and a second portion having a second width, the first portion being disposed between the second portion and the conductive pad, wherein the first width of the first portion is greater…

Detecting voice regions in a non-stationary noisy environment

Granted: April 9, 2024
Patent Number: 11955138
Methods, devices, and systems for voice activity detection are disclosed. An audio signal is received by receiver circuitry. A pitch analysis is performed on the received audio signal by pitch analysis circuitry. A higher-order statistics analysis is performed on the audio signal by statistics analysis circuitry. Logic circuitry determines, based on the pitch analysis and the higher-order statistics analysis, whether the audio signal includes a voice region. The logic circuitry outputs a signal…

Variable width bounding volume hierarchy nodes

Granted: April 9, 2024
Patent Number: 11954788
A technique for performing ray tracing operations is provided. The technique includes processing small bounding box nodes in a box intersection test circuit to generate intersection test results for the small bounding box nodes; and processing large bounding box nodes in the box intersection test circuit to generate intersection test results for the large bounding box nodes.
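
The idea of routing differently sized nodes through one box intersection test can be shown in software with a standard slab test. This is a sketch, not the hardware described in the patent: it assumes that "small" and "large" nodes differ in how many child boxes they hold (the abstract does not define the terms), that ray direction components are nonzero, and the names `ray_hits_box` and `intersect_node` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner)


def ray_hits_box(origin: Vec3, inv_dir: Vec3, box: Box) -> bool:
    """Slab test: intersect the ray's parameter intervals on each axis."""
    t_near, t_far = 0.0, float("inf")
    lo, hi = box
    for o, inv, a, b in zip(origin, inv_dir, lo, hi):
        t0 = (a - o) * inv
        t1 = (b - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
    return t_near <= t_far


def intersect_node(origin: Vec3, direction: Vec3, boxes: Sequence[Box]) -> List[bool]:
    """Run the same box test over every child box of a node, whatever its width."""
    inv_dir = tuple(1.0 / d for d in direction)  # assumes no zero components
    return [ray_hits_box(origin, inv_dir, box) for box in boxes]


origin, direction = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
small_node = [((1, 1, 1), (2, 2, 2)), ((1, -2, 1), (2, -1, 2))]
large_node = small_node + [((3, 3, 3), (4, 4, 4)), ((5, 5, -1), (6, 6, -0.5))]
print(intersect_node(origin, direction, small_node))  # [True, False]
print(intersect_node(origin, direction, large_node))  # [True, False, True, False]
```

The same `ray_hits_box` routine serves both node widths; only the number of boxes handed to it changes.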

Method and apparatus for implementing a rasterizer in GPU operations

Granted: April 9, 2024
Patent Number: 11954757
An apparatus, such as a graphics processing unit (GPU), includes one or more processors configured to determine a plurality of first locality information of a received wave at a processing unit and to select a first processing element of a plurality of processing elements, the first processing element having a plurality of second locality information from a previous wave that matches the plurality of first locality information to execute the received wave.
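
The selection step in this abstract amounts to matching a wave's locality information against what each processing element saw last. A minimal sketch follows, with several assumptions: locality information is modeled as a set of integer tile IDs, a match means exact equality, and a default element is used when nothing matches (the abstract does not describe a fallback). The names `select_element` and `dispatch` are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

Locality = FrozenSet[int]  # assumed representation: a set of tile IDs


def select_element(wave: Locality, previous: Dict[int, Locality]) -> Optional[int]:
    """Return the element whose previous wave had matching locality, if any."""
    for element_id, prev_locality in previous.items():
        if prev_locality == wave:
            return element_id
    return None


def dispatch(wave: Locality, previous: Dict[int, Locality], default_element: int = 0) -> int:
    """Pick an element for the wave and record the wave's locality on it."""
    chosen = select_element(wave, previous)
    if chosen is None:
        chosen = default_element  # fallback policy is an assumption
    previous[chosen] = wave
    return chosen


history = {0: frozenset({1, 2}), 1: frozenset({7, 8})}
first = dispatch(frozenset({7, 8}), history)   # matches element 1
second = dispatch(frozenset({3}), history)     # no match, falls back to element 0
third = dispatch(frozenset({3}), history)      # now matches element 0
print(first, second, third)  # 1 0 0
```

Sending a wave to the element that last handled the same locality is what lets it reuse data that element may still hold in its caches.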

Prefetch kernels on data-parallel processors

Granted: April 9, 2024
Patent Number: 11954036
Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel where the processing kernel includes accesses to data that is stored into the memory resulting from execution of the prefetch kernel.

Page rinsing scheme to keep a directory page in an exclusive state in a single complex

Granted: April 9, 2024
Patent Number: 11954033
A method includes, in a cache directory, storing an entry associating a memory region with an exclusive coherency state, and in response to a memory access directed to the memory region, transmitting a demote superprobe to convert at least one cache line of the memory region from an exclusive coherency state to a shared coherency state.

Paging hierarchies for extended page tables and extended page attributes

Granted: April 9, 2024
Patent Number: 11954026
A processing system includes a processor core for processing instructions and a memory that stores a page table set including an extended page table having an extended page table entry storing extended page table attributes associated with a physical memory page. The system receives a virtual address and translates the virtual address to a physical address for the physical memory page. One or more extended page attributes associated with the physical memory page are retrieved from the…

Separate clocking for components of a graphics processing unit

Granted: April 2, 2024
Patent Number: 11947380
Systems and methods related to controlling clock signals for clocking shader engine modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal…

Enabling accelerated processing units to perform dataflow execution

Granted: April 2, 2024
Patent Number: 11947487
Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring dataflow circuitry based on the decoded information and then performing the dataflow execution of the computational task using the dataflow circuitry.

Cross-chiplet performance data streaming

Granted: April 2, 2024
Patent Number: 11947476
Methods and systems are disclosed for cross-chiplet performance data streaming. Techniques disclosed include accumulating, by a subservient chiplet, event data associated with an event indicative of a performance aspect of the subservient chiplet; sending, by the subservient chiplet, the event data over a chiplet bus to a master chiplet; and adding, by the master chiplet, the received event data to an event record, the event record containing previously received, from the subservient…

Duplicated registers in chiplet processing units

Granted: April 2, 2024
Patent Number: 11947473
Systems, apparatuses, and methods for implementing duplicated registers for access by initiators across multiple semiconductor dies are disclosed. A system includes multiple initiators on multiple semiconductor dies of a chiplet processor. One of the semiconductor dies is the master die, and this master die has copies of registers which can be accessed by the multiple initiators on the multiple semiconductor dies. When a given initiator on a given secondary die generates a register…

Weak cache line invalidation requests for speculatively executing instructions

Granted: April 2, 2024
Patent Number: 11947456
Techniques for invalidating cache lines are provided. The techniques include issuing, to a first level of a memory hierarchy, a weak exclusive read request for a speculatively executing store instruction; determining whether to invalidate one or more cache lines associated with the store instruction in one or more memories; and issuing a weak invalidation request to additional levels of the memory hierarchy.

Suppressing cache line modification

Granted: April 2, 2024
Patent Number: 11947455
Disclosed are a system and method for use in a cache for suppressing modification of a cache line. The system and method include a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs the cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on…