High voltage tolerant capacitors
Granted: January 23, 2024
Patent Number:
11881450
A system and method for fabricating on-die metal-insulator-metal capacitors capable of supporting relatively high voltage applications and increasing capacitance per area are described. In various implementations, an integrated circuit includes multiple metal-insulator-metal (MIM) capacitors. The MIM capacitors are formed between two signal nets such as two different power rails, two different control signals, or two different data signals. The integrated circuit includes multiple…
Cross field effect transistor library cell architecture design
Granted: January 23, 2024
Patent Number:
11881393
A system and method for efficiently creating layout for memory bit cells are described. In various implementations, cells of a library use Cross field effect transistors (FETs) that include vertically stacked gate all around (GAA) transistors with conducting channels oriented in an orthogonal direction between them. The channels of the vertically stacked transistors use opposite doping polarities. A first category of cells includes devices where each of the two devices in a particular…
Hybrid render with deferred primitive batch binning
Granted: January 23, 2024
Patent Number:
11880926
A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin…
Synchronization free cross pass binning through subpass interleaving
Granted: January 23, 2024
Patent Number:
11880924
A method of tiled rendering is provided which comprises dividing a frame to be rendered, into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles and interleaving execution of same subpasses of multiple tiles of the frame. Interleaving execution of same subpasses of multiple tiles comprises executing a previously ordered first subpass of a second tile between execution of the previously ordered first subpass of a first tile and execution of a…
Using multiple functional blocks for training neural networks
Granted: January 23, 2024
Patent Number:
11880769
A system is described that performs training operations for a neural network, the system including an analog circuit element functional block with an array of analog circuit elements, and a controller. The controller monitors error values computed using an output from each of one or more initial iterations of a neural network training operation, the one or more initial iterations being performed using neural network data acquired from the memory. When one or more error values are less…
Method and system for opportunistic load balancing in neural networks using metadata
Granted: January 23, 2024
Patent Number:
11880715
Methods and systems for load balancing in a neural network system using metadata are disclosed. Any one or a combination of one or more kernels, one or more neurons, and one or more layers of the neural network system are tagged with metadata. A scheduler detects whether there are neurons that are available to execute. The scheduler uses the metadata to schedule and load balance computations across compute resources and available resources.
Packed 16 bits instruction pipeline
Granted: January 23, 2024
Patent Number:
11880683
Systems, apparatuses, and methods for efficiently processing arithmetic operations are disclosed. A computing system includes a processor capable of executing single precision mathematical instructions on data sizes of M bits and half precision mathematical instructions on data sizes of N bits, which is less than M bits. At least two source operands with M bits indicated by a received instruction are read from a register file. If the instruction is a packed math instruction, at least a…
Implementing heterogeneous wavefronts on a graphics processing unit (GPU)
Granted: January 16, 2024
Patent Number:
11875425
Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on…
Graded throttling for network-on-chip traffic
Granted: January 16, 2024
Patent Number:
11876718
Graded throttling for network-on-chip traffic, including: calculating, by an agent of a network-on-chip, a number of outstanding transactions issued by the agent; determining that the number of outstanding transactions meets a threshold; and implementing, by the agent, in response to the number of outstanding transactions meeting the threshold, a traffic throttling policy.
Variable tick for DRAM interface calibration
Granted: January 16, 2024
Patent Number:
11875875
Methods and systems are disclosed for calibrating, by a memory interface system, an interface with dynamic random-access memory (DRAM) using a dynamically changing training clock. Techniques disclosed comprise receiving a system clock having a clock signal at a first pulse rate. Then, during the training of the interface, techniques disclosed comprise generating a training clock from the clock signal at the first pulse rate, the training clock having a clock signal at a second pulse…
Management of thrashing in a GPU
Granted: January 16, 2024
Patent Number:
11875197
Systems, apparatuses, and methods for managing a number of wavefronts permitted to concurrently execute in a processing system. An apparatus includes a register file with a plurality of registers and a plurality of compute units configured to execute wavefronts. A control unit of the apparatus is configured to allow a first number of wavefronts to execute concurrently on the plurality of compute units. The control unit is configured to allow no more than a second number of wavefronts to…
Coherent block read fulfillment
Granted: January 16, 2024
Patent Number:
11874783
A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. The plurality of coherent master controllers each include a response data buffer. The coherent slave controller is coupled to the plurality of coherent master controllers. The coherent slave controller, responsive to determining a selected coherent block read command is guaranteed to have only one data response, sends a target request globally ordered message to the selected…
Mechanism to efficiently rinse memory-side cache of dirty data
Granted: January 16, 2024
Patent Number:
11874774
A method includes, in response to each write request of a plurality of write requests received at a memory-side cache device coupled with a memory device, writing payload data specified by the write request to the memory-side cache device, and when a first bandwidth availability condition is satisfied, performing a cache write-through by writing the payload data to the memory device, and recording an indication that the payload data written to the memory-side cache device matches the…
Error detection and correction in memory modules using programmable ECC engines
Granted: January 16, 2024
Patent Number:
11874739
A memory module includes one or more programmable ECC engines that may be programed by a host processing element with a particular ECC implementation. As used herein, the term “ECC implementation” refers to ECC functionality for performing error detection and subsequent processing, for example using the results of the error detection to perform error correction and to encode corrupted data that cannot be corrected, etc. The approach allows an SoC designer or company to program and…
Multi-adaptive cache replacement policy
Granted: January 9, 2024
Patent Number:
11868221
Techniques for performing cache operations are provided. The techniques include tracking performance events for a plurality of test sets of a cache, detecting a replacement policy change trigger event associated with a test set of the plurality of test sets, and in response to the replacement policy change trigger event, operating non-test sets of the cache according to a replacement policy associated with the test set.
Compacted addressing for transaction layer packets
Granted: January 9, 2024
Patent Number:
11868778
Compacted addressing for transaction layer packets, including: determining, for a first epoch, one or more low entropy address bits in a plurality of first transaction layer packets; removing, from one or more memory addresses of one or more second transaction layer packets, the one or more low entropy address bits; and sending the one or more second transaction layer packets.
Processor-guided execution of offloaded instructions using fixed function operations
Granted: January 9, 2024
Patent Number:
11868777
Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request…
Shader source code performance prediction
Granted: January 9, 2024
Patent Number:
11868759
Shader source code performance prediction is described. In accordance with the described techniques, an update to shader source code for implementing a shader is received. A prediction of performance of the shader on a processing unit is generated based on the update to the shader source code. Feedback about the update is output. The feedback includes the prediction of performance of the shader. In one or more implementations, generating the prediction of performance of the shader…
Processing-in-memory concurrent processing system and method
Granted: January 9, 2024
Patent Number:
11868306
A processing system includes a processing unit and a memory device. The memory device includes a processing-in-memory (PIM) module that performs processing operations on behalf of the processing unit. An instruction set architecture (ISA) of the PIM module has fewer instructions than an ISA of the processing unit. Instructions received from the processing unit are translated such that processing resources of the PIM module are virtualized. As a result, the PIM module concurrently…
Using epoch counter values for controlling the retention of cache blocks in a cache
Granted: January 9, 2024
Patent Number:
11868254
An electronic device includes a cache, a memory, and a controller. The controller stores an epoch counter value in metadata for a location in the memory when a cache block evicted from the cache is stored in the location. The controller also controls how the cache block is retained in the cache based at least in part on the epoch counter value when the cache block is subsequently retrieved from the location and stored in the cache.