AMD Patent Applications

VIRTUALLY PADDING DATA STRUCTURES

Published: February 29, 2024
Publication Number: 20240069915
A virtual padding unit provides a virtually padded data structure (e.g., a virtually padded matrix) that yields output values for a padded data structure without storing all of the padding elements in memory. When the virtual padding unit receives a virtual memory address of a location in the virtually padded data structure, it checks whether the location is a non-padded location or a padded location in the virtually padded data…
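
The padded-versus-non-padded address check described above can be sketched in Python. This is a minimal software illustration, not the patented hardware design; the class name and the zero padding value are invented for the example.

```python
class VirtuallyPaddedMatrix:
    """Serve reads from a conceptually padded matrix without ever
    materializing the padding elements in memory (illustrative sketch)."""

    def __init__(self, data, pad, pad_value=0):
        self.data = data               # only the unpadded core is stored
        self.pad = pad                 # padding width on every side
        self.pad_value = pad_value
        self.rows = len(data) + 2 * pad
        self.cols = len(data[0]) + 2 * pad

    def __getitem__(self, idx):
        r, c = idx                     # address in the *padded* coordinate space
        rr, cc = r - self.pad, c - self.pad
        # Check whether the location is a padded or non-padded location.
        if 0 <= rr < len(self.data) and 0 <= cc < len(self.data[0]):
            return self.data[rr][cc]   # real element: read from stored data
        return self.pad_value          # padding: synthesized, nothing stored

m = VirtuallyPaddedMatrix([[1, 2], [3, 4]], pad=1)
print(m[0, 0], m[1, 1], m[2, 2])  # 0 1 4
```

Only the 2x2 core occupies memory; the 4x4 padded view exists purely through the address check.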

EFFICIENT RANK SWITCHING IN MULTI-RANK MEMORY CONTROLLER

Published: February 29, 2024
Publication Number: 20240069811
A data processing system includes a memory accessing agent for generating first memory access requests, a first memory system, and a first memory controller. The first memory system includes a first three-dimensional memory stack comprising a first plurality of stacked memory dice, wherein each memory die of the first three-dimensional memory stack includes a different logical rank of a first memory channel. The first memory controller picks second memory access requests from among the…

ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS

Published: February 15, 2024
Publication Number: 20240054332
Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the…
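
The select-then-apply flow in the abstract can be sketched as follows. The quantizer set, the standard-deviation criterion, and the step sizes are all invented for illustration; the abstract does not specify the actual selection rule.

```python
import statistics

def quantize_uniform(xs, step=0.25):
    """Snap each value to a uniform grid."""
    return [round(x / step) * step for x in xs]

def quantize_clipped(xs, lo=-1.0, hi=1.0, step=0.25):
    """Snap to the grid, then clip to a fixed range."""
    return [min(max(round(x / step) * step, lo), hi) for x in xs]

def adaptive_quantize(weights):
    """Calculate a distribution statistic, pick a quantization function
    from the set based on it, and apply it (criterion is illustrative)."""
    spread = statistics.pstdev(weights)
    fn = quantize_uniform if spread > 1.0 else quantize_clipped
    return fn(weights)
```

A narrow distribution of values selects the clipped quantizer, a wide one the plain uniform quantizer; the quantized values would then be loaded into the network.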

CHIPSET ATTACHED RANDOM ACCESS MEMORY

Published: February 15, 2024
Publication Number: 20240053891
Random access memory (RAM) is attached to an input/output (I/O) controller of a chipset (e.g., on a motherboard). This chipset attached RAM is optionally used as part of a tiered storage solution with other tiers including, for example, nonvolatile memory (e.g., a solid state drive (SSD)) or a hard disk drive. The chipset attached RAM is separate from the system memory, allowing the chipset attached RAM to be used to speed up access to frequently used data stored in the tiered storage…
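
The tiered read path can be sketched as a fast RAM tier caching frequently used blocks in front of a slower SSD tier. This is a generic cache-in-front-of-storage illustration; the class name, FIFO eviction policy, and capacity are assumptions, not details from the application.

```python
class TieredStore:
    """Two-tier read path: a fast chipset-RAM tier in front of a slower
    SSD tier (illustrative sketch)."""

    def __init__(self, ssd_blocks, ram_capacity=2):
        self.ssd = dict(ssd_blocks)   # slow tier: holds the full data set
        self.ram = {}                 # fast tier: hot subset only
        self.capacity = ram_capacity

    def read(self, key):
        if key in self.ram:           # hot path: served from chipset RAM
            return self.ram[key]
        value = self.ssd[key]         # slow path: fetch from the SSD tier
        if len(self.ram) >= self.capacity:
            self.ram.pop(next(iter(self.ram)))  # simple FIFO eviction
        self.ram[key] = value         # promote for faster future access
        return value
```

Because the RAM tier sits outside system memory, promotions here do not compete with the OS for main-memory capacity, which is the benefit the abstract highlights.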

FINE-GRAINED CONDITIONAL DISPATCHING

Published: February 8, 2024
Publication Number: 20240045718
Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup…
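
The reordering the abstract describes, dispatching a flagged workgroup of a later kernel dispatch ahead of that dispatch's other workgroups, can be sketched with workgroups labeled by (dispatch, id) pairs. The representation and function name are invented for illustration.

```python
def dispatch_order(workgroups, prioritized):
    """Move the prioritized workgroup ahead of the other workgroups of
    its own kernel dispatch, leaving earlier dispatches untouched
    (illustrative sketch of the abstract's reordering)."""
    out = []
    placed = False
    for wg in workgroups:
        if wg == prioritized:
            continue                          # re-inserted at its new slot
        if not placed and wg[0] == prioritized[0]:
            out.append(prioritized)           # first slot of its own dispatch
            placed = True
        out.append(wg)
    if not placed:
        out.append(prioritized)
    return out
```

For a queue [(1,0), (2,0), (2,1), (2,2)] with (2,2) prioritized, workgroup (2,2) is dispatched before (2,0) and (2,1) but still after dispatch 1's workgroups.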

METHODS FOR CONSTRUCTING PACKAGE SUBSTRATES WITH HIGH DENSITY

Published: February 8, 2024
Publication Number: 20240047228
A disclosed method can include (i) positioning a first surface of a component of a semiconductor device on a first plated through-hole, (ii) covering, with a layer of dielectric material, at least a second surface of the component that is opposite the first surface of the component, (iii) removing a portion of the layer of dielectric material covering the second surface of the component to form at least one cavity, and (iv) depositing conductive material in the cavity to form a second…

DYNAMIC PERFORMANCE ADJUSTMENT

Published: February 1, 2024
Publication Number: 20240037031
A technique for operating a device is disclosed. The technique includes recording log data for the device; analyzing the log data to determine one or more performance settings adjustments to apply to the device; and applying the one or more performance settings adjustments to the device.
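
The record-analyze-apply loop can be sketched as below. The log markers, thresholds, and setting names are invented for illustration; the abstract names no specific settings.

```python
def analyze_log(log_lines):
    """Analyze recorded log data and return performance-settings
    adjustments (markers and thresholds are illustrative)."""
    throttle_events = sum(1 for line in log_lines if "thermal_throttle" in line)
    if throttle_events > 3:
        return {"fan_curve": "aggressive", "power_limit_w": 95}
    return {}                       # no adjustment needed

def apply_adjustments(device_settings, adjustments):
    """Apply the recommended adjustments to the device's settings."""
    updated = dict(device_settings)
    updated.update(adjustments)
    return updated
```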

DYNAMIC RANDOM-ACCESS MEMORY (DRAM) PHASE TRAINING UPDATE

Published: February 1, 2024
Publication Number: 20240036748
A phase training update circuit operates to perform a phase training update on individual bit lanes. The phase training update circuit adjusts a bit lane transmit phase offset forward a designated number of phase steps, transmits a training pattern, and determines a first number of errors in the transmission. It also adjusts the bit lane transmit phase offset backward the designated number of phase steps, transmits the training pattern, and determines a second number of errors in the…
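
The forward/backward probe-and-compare step can be sketched as follows, with a callable standing in for the hardware step of transmitting the training pattern at a given offset and counting errors. Function and parameter names are invented.

```python
def phase_training_update(offset, steps, error_count_at):
    """Nudge a bit lane's transmit phase offset toward whichever
    direction shows fewer training-pattern errors. `error_count_at`
    stands in for transmitting the pattern and counting errors
    (illustrative sketch)."""
    fwd_errors = error_count_at(offset + steps)   # probe forward
    bwd_errors = error_count_at(offset - steps)   # probe backward
    if fwd_errors < bwd_errors:
        return offset + steps
    if bwd_errors < fwd_errors:
        return offset - steps
    return offset                                 # tie: hold position
```

Repeated per bit lane, this walks each lane's transmit phase toward the center of its error-free window.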

MULTI-ACCELERATOR COMPUTE DISPATCH

Published: January 25, 2024
Publication Number: 20240029336
Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such…
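
The assign, execute, and notify sequence can be sketched sequentially in Python. The round-robin assignment, the doubling stand-in for execution, and the event strings are all invented; real chiplets would run concurrently.

```python
def run_dispatch(num_chiplets, workgroups):
    """Assign a kernel dispatch's workgroups to chiplets, have each
    chiplet 'execute' its share and announce completion, then notify
    the client once every chiplet is done (sequential simulation)."""
    assigned = {c: [] for c in range(num_chiplets)}
    for i, wg in enumerate(workgroups):
        assigned[i % num_chiplets].append(wg)   # static round-robin split
    events = []
    done = set()
    for chiplet, wgs in assigned.items():
        results = [wg * 2 for wg in wgs]        # stand-in for executing workgroups
        done.add(chiplet)                       # notify the other chiplets
        events.append(f"chiplet {chiplet} done ({len(results)} workgroups)")
    if done == set(range(num_chiplets)):        # every chiplet finished its share
        events.append("client notified")
    return events
```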

DISTRIBUTION OF A WORKLOAD AMONG NODES OF A SYSTEM WITH A NUMA ARCHITECTURE

Published: January 18, 2024
Publication Number: 20240020173
Methods and systems are disclosed for distribution of a workload among nodes of a NUMA architecture. Techniques disclosed include receiving the workload and data batches, the data batches to be processed by the workload. Techniques disclosed further include assigning workload processes to the nodes according to a determined distribution, and then executing the workload according to the determined distribution. The determined distribution is selected out of a set of distributions, so…
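
Selecting one distribution out of a candidate set can be sketched as scoring each candidate and taking the best. The abstract's actual selection criterion is cut off, so a generic cost function stands in for it; the tuple representation and imbalance cost are invented.

```python
def pick_distribution(distributions, cost):
    """Select the workload-to-node distribution with the lowest score
    from a candidate set (the real criterion is not in the abstract)."""
    return min(distributions, key=cost)

# Hypothetical example: 4 worker processes split across 2 NUMA nodes,
# scored by how unevenly the processes are balanced.
candidates = [(4, 0), (3, 1), (2, 2)]
best = pick_distribution(candidates, cost=lambda d: abs(d[0] - d[1]))
print(best)  # (2, 2): the most balanced candidate
```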

ENCODED DATA DEPENDENCY MATRIX FOR POWER EFFICIENCY SCHEDULING

Published: January 4, 2024
Publication Number: 20240004657
The disclosed system may include a processor configured to encode, using an encoding scheme that reduces a number of bits needed to represent one or more instructions from a set of instructions in an instruction buffer represented by a dependency matrix, a dependency indicating that a child instruction represented in the dependency matrix depends on a parent instruction represented in the dependency matrix. The processor may also be configured to store the encoded dependency in the…
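
One common way to reduce the bits needed per dependency, possibly in the spirit of the abstract, is to store the parent's slot index in ceil(log2(N)) bits instead of an N-bit one-hot matrix row. The patented encoding is not detailed in the abstract, so this is only an illustrative stand-in.

```python
import math

def encode_dependency(parent_slot, buffer_size):
    """Encode 'this child depends on the instruction in parent_slot'
    as a binary index rather than a one-hot row: ceil(log2(N)) bits
    instead of N bits for an N-entry buffer (illustrative encoding)."""
    bits = max(1, math.ceil(math.log2(buffer_size)))
    assert 0 <= parent_slot < buffer_size
    return format(parent_slot, f"0{bits}b")

print(encode_dependency(5, 64))  # '000101': 6 bits for a 64-entry buffer
```

For a 64-entry buffer this shrinks each dependency from 64 bits to 6, which is the kind of saving that makes wakeup logic cheaper to power.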

CHANNEL AND SUB-CHANNEL THROTTLING FOR MEMORY CONTROLLERS

Published: January 4, 2024
Publication Number: 20240005971
An arbiter is operable to pick commands from a command queue for dispatch to a memory. The arbiter includes a traffic throttle circuit for mitigating excess power usage increases in coordination with one or more additional arbiters. The traffic throttle circuit includes a monitoring circuit and a throttle circuit. The monitoring circuit is for measuring a number of read and write commands picked by the arbiter and the one or more additional arbiters over a first predetermined period of…
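
The monitoring side, counting commands picked over a window and comparing against a budget, can be sketched with a sliding window. The window length and budget here are illustrative, not values from the application.

```python
from collections import deque

class TrafficThrottle:
    """Count read/write commands picked over a sliding window of cycles
    and engage the throttle when the total (across this arbiter and its
    coordinating peers) exceeds a power budget (illustrative numbers)."""

    def __init__(self, window=4, budget=10):
        self.window = deque(maxlen=window)  # per-cycle picked-command counts
        self.budget = budget

    def record_cycle(self, commands_picked):
        self.window.append(commands_picked)  # oldest cycle drops out automatically

    def throttled(self):
        return sum(self.window) > self.budget
```

When `throttled()` is true, the arbiter would hold back further picks until older cycles age out of the window and the total falls back under budget.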

SCHEDULING TRAINING OF AN INTER-CHIPLET INTERFACE

Published: January 4, 2024
Publication Number: 20240004815
Systems and methods are disclosed for scheduling a data link training by a controller. The systems and methods include receiving an indication that a physical layer of a data link is not transferring data and initiating a training process of the physical layer of the data link in response to the indication that the physical layer of the data link is not transferring data. In one aspect, the indication that the physical layer of a data link is not transferring data is an indication that the…
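
The gating decision can be sketched in a few lines: train only on an idle indication, defer otherwise. The parameters, including the retraining interval, are invented for illustration.

```python
def schedule_training(phy_idle, cycles_since_training, interval=100):
    """Initiate physical-layer retraining only when the link indicates
    it is not transferring data and retraining is due; defer otherwise
    (idle indication and interval are illustrative)."""
    if phy_idle and cycles_since_training >= interval:
        return "training initiated"   # safe: no in-flight traffic to disturb
    return "training deferred"
```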

SPLIT REGISTER LIST FOR RENAMING

Published: January 4, 2024
Publication Number: 20240004664
The disclosed system may include a processor configured to detect that a data unit size for an instruction is smaller than a register. The processor may allocate a first portion of the register to the instruction in a manner that leaves a second portion of the register available for allocating to an additional instruction. The processor may also track the register as a split register. Various other methods, systems, and computer-readable media are also disclosed.
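
The allocate-half-and-track-as-split idea can be sketched with a register file whose entries are managed as halves. The 64-bit width, the "lo"/"hi" labels, and the allocation policy are invented for the example.

```python
class SplitRegisterFile:
    """When a value is narrower than a physical register, allocate only
    half the register and mark the register split, leaving the other
    half free for a different instruction (illustrative sketch)."""

    REG_BITS = 64

    def __init__(self, num_regs=4):
        self.free_halves = [(r, h) for r in range(num_regs) for h in ("lo", "hi")]
        self.split = set()            # registers currently tracked as split

    def allocate(self, data_bits):
        if data_bits <= self.REG_BITS // 2:
            reg, half = self.free_halves.pop(0)  # hand out one half only
            self.split.add(reg)                  # track the register as split
            return (reg, half)
        # Full-width value: need both halves of the same register.
        for r in sorted({r for r, _ in self.free_halves}):
            halves = [(rr, h) for rr, h in self.free_halves if rr == r]
            if len(halves) == 2:
                for item in halves:
                    self.free_halves.remove(item)
                return (r, "full")
        raise RuntimeError("no free register")
```

Two 32-bit allocations end up sharing one physical register, which is the capacity win the abstract describes.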