AMD Patent Grants

Dynamic hardware selection for experts in mixture-of-experts model

Granted: February 6, 2024
Patent Number: 11893502
A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority…
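The abstract describes ranking tables kept per expert and per execution parameter. The following is a minimal sketch of one plausible data layout and lookup, with all names, devices, and parameters invented for illustration; it is not the patented mechanism.

```python
# Hypothetical sketch of the priority data described above: for each expert and
# each execution parameter (e.g. "speed", "power"), keep a ranked list of devices.
# Names and structure are illustrative, not taken from the patent.

priority_data = {
    "expert_0": {
        "speed": ["gpu_0", "gpu_1", "cpu_0"],   # best-to-worst for execution speed
        "power": ["cpu_0", "gpu_1", "gpu_0"],   # best-to-worst for power efficiency
    },
    "expert_1": {
        "speed": ["gpu_1", "gpu_0", "cpu_0"],
        "power": ["cpu_0", "gpu_0", "gpu_1"],
    },
}

def assign_expert(expert: str, parameter: str, available: set[str]) -> str | None:
    """Return the highest-ranked available device for the given expert and parameter."""
    for device in priority_data[expert][parameter]:
        if device in available:
            return device
    return None

print(assign_expert("expert_0", "speed", {"gpu_1", "cpu_0"}))  # -> gpu_1
```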

Distributing power shared between an accelerated processing unit and a discrete graphics processing unit

Granted: January 30, 2024
Patent Number: 11886878
An integrated coprocessor such as an accelerated processing unit (APU) generates commands for execution on a discrete coprocessor such as a discrete graphics processing unit (dGPU). Power distribution circuitry selectively provides power to the APU and the dGPU based on characteristics of workloads executing on the APU and the dGPU and based on a platform power limit that is shared by the APU and the dGPU. In some cases, the power distribution circuitry determines a first power provided…
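As a rough illustration of sharing a platform power limit, the sketch below splits a budget between an APU and a dGPU in proportion to a simple workload-activity metric. The floors, the activity metric, and the proportional split are assumptions, not the patented power distribution circuitry.

```python
# Illustrative sketch (not the patented circuit): split a shared platform power
# limit between an APU and a dGPU in proportion to a simple workload-activity metric.

def distribute_power(platform_limit_w: float,
                     apu_activity: float,
                     dgpu_activity: float,
                     apu_floor_w: float = 5.0,
                     dgpu_floor_w: float = 5.0) -> tuple[float, float]:
    """Return (apu_power, dgpu_power) that sum to the platform limit.

    Each device gets at least a floor allocation; the remainder is shared
    in proportion to the activity of the workload running on each device.
    """
    remaining = platform_limit_w - apu_floor_w - dgpu_floor_w
    total_activity = (apu_activity + dgpu_activity) or 1.0
    apu_power = apu_floor_w + remaining * (apu_activity / total_activity)
    dgpu_power = dgpu_floor_w + remaining * (dgpu_activity / total_activity)
    return apu_power, dgpu_power

print(distribute_power(45.0, apu_activity=0.2, dgpu_activity=0.8))  # (12.0, 33.0)
```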

Sharing package pins in a multi-chip module (MCM)

Granted: January 30, 2024
Patent Number: 11886370
A semiconductor package includes multiple dies that share the same package pin. An output enable register provided on each die is used to select the die that drives an output to the shared pin. A hardware arbitration circuit ensures that two or more dies do not drive an output to the shared pin at the same time.
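A small software model can make the arbitration rule concrete: each die has an output-enable bit, and the arbiter allows at most one enabled die to drive the shared pin. This is an illustrative model only, not the hardware arbitration circuit itself.

```python
# Software model (illustrative only) of the arbitration rule: each die has an
# output-enable bit, and the arbiter lets at most one enabled die drive the shared pin.

def drive_shared_pin(output_enable: dict[str, bool], requested_values: dict[str, int]):
    """Return (driving_die, pin_value); only one enabled die may drive the pin."""
    enabled = [die for die, en in output_enable.items() if en]
    if len(enabled) != 1:
        return None, None          # arbiter blocks the drive to avoid contention
    die = enabled[0]
    return die, requested_values[die]

print(drive_shared_pin({"die0": True, "die1": False}, {"die0": 1, "die1": 0}))  # ('die0', 1)
print(drive_shared_pin({"die0": True, "die1": True},  {"die0": 1, "die1": 0}))  # (None, None)
```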

Core selection based on usage policy and core constraints

Granted: January 30, 2024
Patent Number: 11886224
A processing unit of a processing system compiles a priority queue listing of a plurality of processor cores to run a workload based on a cost of running the workload on each of the processor cores. The cost is based on at least one of a system usage policy, characteristics of the workload, and one or more physical constraints of each processor core. The processing unit selects a processor core based on the cost to run the workload and communicates an identifier of the selected processor…
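The sketch below shows one way a cost-ordered priority queue of cores might look, combining a usage policy, workload characteristics, and per-core physical constraints into a toy cost model. The weighting and the fields are assumptions for illustration.

```python
# Hypothetical sketch of cost-based core selection: build a priority queue of
# cores ordered by an assumed cost model combining usage policy, workload
# characteristics, and per-core physical constraints.

import heapq

def core_cost(core, workload, policy):
    """Toy cost model; the real weighting is implementation-specific."""
    thermal_penalty = core["temperature_c"] / core["max_temperature_c"]
    fit_penalty = abs(core["perf_class"] - workload["perf_demand"])
    return policy["thermal_weight"] * thermal_penalty + policy["fit_weight"] * fit_penalty

def select_core(cores, workload, policy):
    queue = [(core_cost(c, workload, policy), c["id"]) for c in cores]
    heapq.heapify(queue)              # lowest cost first
    cost, core_id = heapq.heappop(queue)
    return core_id                    # identifier communicated onward

cores = [
    {"id": 0, "perf_class": 3, "temperature_c": 70, "max_temperature_c": 95},
    {"id": 1, "perf_class": 1, "temperature_c": 45, "max_temperature_c": 95},
]
policy = {"thermal_weight": 0.5, "fit_weight": 0.5}
print(select_core(cores, {"perf_demand": 1}, policy))  # -> 1
```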

Method and system for opportunistic load balancing in neural networks using metadata

Granted: January 23, 2024
Patent Number: 11880715
Methods and systems for load balancing in a neural network system using metadata are disclosed. Any one or a combination of one or more kernels, one or more neurons, and one or more layers of the neural network system are tagged with metadata. A scheduler detects whether there are neurons that are available to execute. The scheduler uses the metadata to schedule and load-balance computations across the available compute resources.
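One way to read the abstract is that the metadata tags carry scheduling hints the scheduler matches against idle resources. The sketch below assumes a simple dict of tags per work item and a first-fit dispatch loop; the tag names and policy are illustrative, not from the patent.

```python
# Minimal sketch, assuming metadata is a simple dict of tags attached to each
# kernel/neuron/layer; the scheduler here just matches tagged work to idle
# resources (names are illustrative).

from collections import deque

work_queue = deque([
    {"name": "layer0_matmul", "meta": {"compute": "gpu", "est_cycles": 900}},
    {"name": "layer0_bias",   "meta": {"compute": "cpu", "est_cycles": 50}},
    {"name": "layer1_matmul", "meta": {"compute": "gpu", "est_cycles": 1200}},
])

idle_resources = {"gpu": ["gpu_0"], "cpu": ["cpu_0", "cpu_1"]}

def schedule_step():
    """Dispatch every queued item whose preferred resource type has an idle unit."""
    dispatched = []
    for _ in range(len(work_queue)):
        item = work_queue.popleft()
        pool = idle_resources.get(item["meta"]["compute"], [])
        if pool:
            dispatched.append((item["name"], pool.pop(0)))
        else:
            work_queue.append(item)   # no idle resource yet; try again later
    return dispatched

print(schedule_step())  # [('layer0_matmul', 'gpu_0'), ('layer0_bias', 'cpu_0')]
```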

High voltage tolerant capacitors

Granted: January 23, 2024
Patent Number: 11881450
A system and method for fabricating on-die metal-insulator-metal capacitors capable of supporting relatively high voltage applications and increasing capacitance per area are described. In various implementations, an integrated circuit includes multiple metal-insulator-metal (MIM) capacitors. The MIM capacitors are formed between two signal nets such as two different power rails, two different control signals, or two different data signals. The integrated circuit includes multiple…

Cross field effect transistor library cell architecture design

Granted: January 23, 2024
Patent Number: 11881393
A system and method for efficiently creating layout for memory bit cells are described. In various implementations, cells of a library use Cross field effect transistors (FETs) that include vertically stacked gate all around (GAA) transistors with conducting channels oriented in an orthogonal direction between them. The channels of the vertically stacked transistors use opposite doping polarities. A first category of cells includes devices where each of the two devices in a particular…

Hybrid render with deferred primitive batch binning

Granted: January 23, 2024
Patent Number: 11880926
A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin…
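To make "bin intercepts" concrete, the sketch below assumes 2D triangles, fixed-size screen-space bins, and a bounding-box approximation of the bins a primitive touches, then processes the batch bin by bin. These simplifications are assumptions for illustration, not the patented technique.

```python
# Rough sketch of deferred primitive batch binning under simplifying assumptions:
# primitives are 2D triangles, bins are fixed-size screen tiles, and a "bin
# intercept" is approximated by the bins overlapped by the triangle's bounding box.

BIN_SIZE = 32  # pixels per bin edge (assumed)

def bin_intercepts(triangle):
    """Return the set of (bin_x, bin_y) bins the triangle's bounding box touches."""
    xs = [v[0] for v in triangle]
    ys = [v[1] for v in triangle]
    x0, x1 = int(min(xs)) // BIN_SIZE, int(max(xs)) // BIN_SIZE
    y0, y1 = int(min(ys)) // BIN_SIZE, int(max(ys)) // BIN_SIZE
    return {(bx, by) for bx in range(x0, x1 + 1) for by in range(y0, y1 + 1)}

def shade_batch(batch):
    """Group the batch by bin, then process one bin at a time (the iterative part)."""
    per_bin = {}
    for prim in batch:
        for b in bin_intercepts(prim):
            per_bin.setdefault(b, []).append(prim)
    for b in sorted(per_bin):
        print(f"shading bin {b} with {len(per_bin[b])} primitive(s)")

shade_batch([[(5, 5), (40, 10), (20, 60)], [(100, 100), (110, 105), (104, 120)]])
```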

Synchronization free cross pass binning through subpass interleaving

Granted: January 23, 2024
Patent Number: 11880924
A method of tiled rendering is provided which comprises dividing a frame to be rendered into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles, and interleaving execution of same subpasses of multiple tiles of the frame. Interleaving execution of same subpasses of multiple tiles comprises executing a previously ordered first subpass of a second tile between execution of the previously ordered first subpass of a first tile and execution of a…
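The ordering described above can be illustrated with a small schedule generator: the same subpass runs across all tiles before the next subpass begins. The command names and loop structure are illustrative only.

```python
# Illustrative sketch of the interleaving order described above: instead of running
# all subpasses of tile 0, then all subpasses of tile 1, the same subpass is executed
# across the tiles before moving to the next subpass.

def interleaved_schedule(num_tiles: int, num_subpasses: int):
    """Yield (subpass, tile) pairs so that subpass N of every tile runs before subpass N+1."""
    for subpass in range(num_subpasses):
        for tile in range(num_tiles):
            yield subpass, tile

for subpass, tile in interleaved_schedule(num_tiles=2, num_subpasses=3):
    print(f"execute subpass {subpass} of tile {tile}")
# Subpass 0 of tile 1 runs between subpass 0 of tile 0 and subpass 1 of tile 0,
# matching the ordering sketched in the abstract.
```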

Using multiple functional blocks for training neural networks

Granted: January 23, 2024
Patent Number: 11880769
A system is described that performs training operations for a neural network, the system including an analog circuit element functional block with an array of analog circuit elements, and a controller. The controller monitors error values computed using an output from each of one or more initial iterations of a neural network training operation, the one or more initial iterations being performed using neural network data acquired from the memory. When one or more error values are less…
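One plausible reading of the control flow is: run a few initial iterations on conventional hardware, monitor the error, and hand later iterations to the analog array once the error is low enough. The threshold, iteration count, and the analog interface below are all assumptions, not details from the patent.

```python
# Hedged sketch of one plausible reading of the abstract's control flow.
# Threshold, iteration count, and the analog hand-off are assumptions.

def train(initial_error: float, error_threshold: float = 0.1, max_initial_iters: int = 5):
    error = initial_error
    for iteration in range(max_initial_iters):
        error *= 0.6                       # stand-in for a digital training step
        print(f"digital iteration {iteration}: error={error:.3f}")
        if error < error_threshold:
            break
    print("switching remaining training to the analog circuit element array")

train(initial_error=0.5)
```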

Packed 16 bits instruction pipeline

Granted: January 23, 2024
Patent Number: 11880683
Systems, apparatuses, and methods for efficiently processing arithmetic operations are disclosed. A computing system includes a processor capable of executing single precision mathematical instructions on data sizes of M bits and half precision mathematical instructions on data sizes of N bits, which is less than M bits. At least two source operands with M bits indicated by a received instruction are read from a register file. If the instruction is a packed math instruction, at least a…
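Packed math can be illustrated in software by treating a 32-bit operand as two independent 16-bit lanes and producing both results in one operation. This is a toy illustration of the concept, not the patented pipeline.

```python
# Toy sketch of packed math: a 32-bit source operand is treated as two independent
# 16-bit lanes, and one "instruction" produces two 16-bit results in one pass.

MASK16 = 0xFFFF

def packed_add16(a: int, b: int) -> int:
    """Add the low and high 16-bit lanes of two 32-bit operands independently."""
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo

a = (7 << 16) | 10      # lanes: hi=7, lo=10
b = (3 << 16) | 65535   # lanes: hi=3, lo=65535 (wraps)
print(hex(packed_add16(a, b)))  # 0xa0009 -> hi lane 10, lo lane 9 (10 + 65535 wraps to 9)
```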

Storage location assignment at a cluster compute server

Granted: January 23, 2024
Patent Number: 11880610
A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a…
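A minimal sketch of the two storage classes, assuming a simple rule: data accessed by more than one compute node goes to a common volume, otherwise to the requesting node's dedicated volume. The policy and the names are assumptions.

```python
# Illustrative classification of data into common vs. dedicated storage volumes.

def assign_volume(data_item, requesting_node):
    if len(data_item["accessed_by"]) > 1:
        return "common_volume_0"
    return f"dedicated_volume_{requesting_node}"

print(assign_volume({"name": "os_image",   "accessed_by": {"node0", "node1"}}, "node0"))
print(assign_volume({"name": "scratch.db", "accessed_by": {"node3"}},          "node3"))
```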

Data as compute

Granted: January 23, 2024
Patent Number: 11880312
A method includes storing a function representing a set of data elements stored in a backing memory and, in response to a first memory read request for a first data element of the set of data elements, calculating a function result representing the first data element based on the function.
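The idea can be illustrated with a backing store that keeps a generating function instead of the elements themselves, evaluating it on each read. The arithmetic-sequence function and the names below are illustrative assumptions.

```python
# Minimal sketch of the "data as compute" idea: instead of materializing every
# element, the backing store keeps a function describing the set, and a read
# request for element i evaluates the function on demand.

def make_backing_function(start: int, step: int):
    """Represent an arithmetic sequence without storing its elements."""
    return lambda index: start + step * index

backing = make_backing_function(start=100, step=4)

def read_element(index: int) -> int:
    """Service a memory read by calculating the requested element on demand."""
    return backing(index)

print(read_element(0))   # 100
print(read_element(10))  # 140
```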

Cache access measurement deskew

Granted: January 23, 2024
Patent Number: 11880310
A processor includes a cache having two or more test regions and a larger non-test region. The processor further includes a cache controller that applies different cache replacement policies to the different test regions of the cache, and a performance monitor that measures performance metrics for the different test regions, such as a cache hit rate at each test region. Based on the performance metrics, the cache controller selects a cache replacement policy for the non-test region, such…
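A hypothetical sketch of the test-region approach: two small test regions run different replacement policies, a monitor tracks hit rates, and the better-performing policy is applied to the large non-test region. Policy names and counters are illustrative.

```python
# Sketch: pick the replacement policy for the main region based on measured
# hit rates in two small test regions (names and data are illustrative).

hit_counters = {"lru_region": {"hits": 0, "accesses": 0},
                "random_region": {"hits": 0, "accesses": 0}}

def record_access(region: str, hit: bool) -> None:
    hit_counters[region]["accesses"] += 1
    hit_counters[region]["hits"] += int(hit)

def select_policy_for_main_region() -> str:
    def hit_rate(region):
        c = hit_counters[region]
        return c["hits"] / c["accesses"] if c["accesses"] else 0.0
    return max(hit_counters, key=hit_rate)

for hit in (True, True, False, True):
    record_access("lru_region", hit)
for hit in (True, False, False, False):
    record_access("random_region", hit)

print(select_policy_for_main_region())  # -> lru_region (higher measured hit rate)
```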

Selecting an error correction code type for a memory device

Granted: January 23, 2024
Patent Number: 11880277
Selecting an error correction code type for a memory device includes: selecting, by the memory device in dependence upon predefined selection criteria, one of a plurality of error correction code types and carrying out memory access requests utilizing the selected error correction code type.
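One plausible reading of "predefined selection criteria" is a table keyed on an observed error rate, with stronger (and costlier) codes chosen as errors increase. The thresholds and code names in the sketch below are assumptions.

```python
# Sketch only: choose an ECC type from a toy criteria table keyed on error rate.

def select_ecc_type(observed_bit_error_rate: float) -> str:
    if observed_bit_error_rate < 1e-9:
        return "SEC-DED"          # single-error-correct, double-error-detect
    if observed_bit_error_rate < 1e-6:
        return "chipkill"         # corrects a full-device failure
    return "reed-solomon"         # strongest option in this toy table

print(select_ecc_type(1e-10))  # SEC-DED
print(select_ecc_type(1e-7))   # chipkill
```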

Instruction subset implementation for low power operation

Granted: January 23, 2024
Patent Number: 11880260
A heterogeneous processor system includes a first processor implementing an instruction set architecture (ISA) including a set of ISA features and configured to support a first subset of the set of ISA features. The heterogeneous processor system also includes a second processor implementing the ISA including the set of ISA features and configured to support a second subset of the set of ISA features, wherein the first subset and the second subset of the set of ISA features are different…
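As an illustration, a scheduler could advertise each core's supported ISA-feature subset and place a thread on the lowest-power core whose subset covers the features the thread uses. The feature names and placement policy below are examples, not a real feature list.

```python
# Illustrative sketch: place a thread on a core whose ISA-feature subset covers
# the features the thread requires (feature names are invented examples).

CORE_FEATURES = {
    "big_core":    {"base", "simd256", "fma", "crypto"},
    "little_core": {"base", "simd128"},
}

def place_thread(required_features: set[str]) -> str:
    """Prefer the low-power core when its feature subset is sufficient."""
    if required_features <= CORE_FEATURES["little_core"]:
        return "little_core"
    if required_features <= CORE_FEATURES["big_core"]:
        return "big_core"
    raise RuntimeError("no core supports the required ISA features")

print(place_thread({"base", "simd128"}))          # little_core
print(place_thread({"base", "simd256", "fma"}))   # big_core
```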

Secondary external cooling for mobile computing devices

Granted: January 23, 2024
Patent Number: 11880248
A docking station for secondary external cooling for mobile computing devices, including: one or more ports for docking a mobile computing device; a docking platform for supporting the mobile computing device; a cooling element; and a thermal interface housed in the docking platform for transferring heat between the mobile computing device and the cooling element.

Error detection and correction in memory modules using programmable ECC engines

Granted: January 16, 2024
Patent Number: 11874739
A memory module includes one or more programmable ECC engines that may be programmed by a host processing element with a particular ECC implementation. As used herein, the term “ECC implementation” refers to ECC functionality for performing error detection and subsequent processing, for example using the results of the error detection to perform error correction and to encode corrupted data that cannot be corrected, etc. The approach allows an SoC designer or company to program and…

Coherent block read fulfillment

Granted: January 16, 2024
Patent Number: 11874783
A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. The plurality of coherent master controllers each include a response data buffer. The coherent slave controller is coupled to the plurality of coherent master controllers. The coherent slave controller, responsive to determining a selected coherent block read command is guaranteed to have only one data response, sends a target request globally ordered message to the selected…

Mechanism to efficiently rinse memory-side cache of dirty data

Granted: January 16, 2024
Patent Number: 11874774
A method includes, in response to each write request of a plurality of write requests received at a memory-side cache device coupled with a memory device, writing payload data specified by the write request to the memory-side cache device, and when a first bandwidth availability condition is satisfied, performing a cache write-through by writing the payload data to the memory device, and recording an indication that the payload data written to the memory-side cache device matches the…
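A minimal sketch of the opportunistic write-through: each write lands in the memory-side cache; when memory bandwidth is available, the line is also written through and marked clean, so it will not need rinsing later. The bandwidth threshold and the data structures are assumptions.

```python
# Sketch of opportunistic write-through from a memory-side cache, with an assumed
# utilization threshold standing in for the "bandwidth availability condition".

cache = {}            # address -> {"data": ..., "clean": bool}
memory = {}

def bandwidth_available(current_utilization: float, threshold: float = 0.7) -> bool:
    return current_utilization < threshold

def handle_write(address: int, payload: bytes, memory_utilization: float) -> None:
    cache[address] = {"data": payload, "clean": False}
    if bandwidth_available(memory_utilization):
        memory[address] = payload                 # opportunistic write-through
        cache[address]["clean"] = True            # record that cache matches memory

handle_write(0x1000, b"\xde\xad", memory_utilization=0.3)   # written through
handle_write(0x2000, b"\xbe\xef", memory_utilization=0.95)  # stays dirty for now
print({hex(a): e["clean"] for a, e in cache.items()})  # {'0x1000': True, '0x2000': False}
```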