Semiconductor chip package with spring biased lid
Granted: April 18, 2023
Patent Number:
11631624
Various semiconductor chip packages are disclosed. In one aspect, a semiconductor chip package is provided that includes a package substrate that has a first edge and a second edge opposite to the first edge. A semiconductor chip is mounted on the package substrate. A thermal interface material is positioned on the semiconductor chip. A lid is positioned over the thermal interface material. A spring biasing mechanism is included that is operable to bias the lid away from the package…
Depth buffer pre-pass
Granted: April 18, 2023
Patent Number:
11631187
Systems, apparatuses, and methods for implementing a depth buffer pre-pass are disclosed. A rendering application uses a binning approach to render primitives of a virtual scene on a tile-by-tile basis, with each tile corresponding to a portion of the screen. The application causes a depth buffer pre-pass to be performed for the primitives of the tile before a pixel shader is invoked. During the depth buffer pre-pass, only the depth part of the virtual scene is rendered to determine…
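The two-pass idea in this abstract can be illustrated with a minimal sketch. The abstract is truncated, so what follows is an assumption about a common use of a depth pre-pass, not the patented method: pass one records only the nearest depth per pixel of a tile, and pass two invokes a shader only for fragments matching that depth. The names `Fragment`, `render_tile`, and `counting_shader` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical fragment layout: (x, y, depth, primitive_id).
# Assumption: a smaller depth value means closer to the camera.
Fragment = Tuple[int, int, float, int]
Pixel = Tuple[int, int]


def render_tile(fragments: List[Fragment],
                shade: Callable[[int, int, int], str]) -> Dict[Pixel, str]:
    # Pass 1 (depth pre-pass): record only the nearest depth per pixel.
    depth_buffer: Dict[Pixel, float] = {}
    for x, y, depth, _ in fragments:
        if depth < depth_buffer.get((x, y), float("inf")):
            depth_buffer[(x, y)] = depth

    # Pass 2: call the shader only for the fragment that matches the
    # depth recorded in pass 1, so hidden fragments are never shaded.
    color_buffer: Dict[Pixel, str] = {}
    for x, y, depth, prim in fragments:
        if depth == depth_buffer[(x, y)] and (x, y) not in color_buffer:
            color_buffer[(x, y)] = shade(x, y, prim)
    return color_buffer


shader_calls = []


def counting_shader(x: int, y: int, prim: int) -> str:
    # Stand-in for a pixel shader; it only logs that it was invoked.
    shader_calls.append((x, y, prim))
    return f"prim{prim}"


# Three fragments overlap pixel (0, 0); one fragment covers pixel (1, 0).
tile = [(0, 0, 0.9, 1), (0, 0, 0.2, 2), (1, 0, 0.5, 1), (0, 0, 0.6, 3)]
result = render_tile(tile, counting_shader)
```

In this sketch the shader runs once per covered pixel rather than once per fragment, which is the usual motivation for a depth pre-pass.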
Optimized asynchronous training of neural networks using a distributed parameter server with eager updates
Granted: April 18, 2023
Patent Number:
11630994
A method of training a neural network includes, at a local computing node, receiving remote parameters from a set of one or more remote computing nodes, initiating execution of a forward pass in a local neural network in the local computing node to determine a final output based on the remote parameters, initiating execution of a backward pass in the local neural network to determine updated parameters for the local neural network, and prior to completion of the backward pass,…
Suppressing cache line modification
Granted: April 18, 2023
Patent Number:
11630772
Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method include a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on…
Method and system for recording and logging error handling information
Granted: April 18, 2023
Patent Number:
11630715
A method and system for recording and logging errors in a computer system includes reading first error handling information with respect to a transaction. The first error handling information is stored in a first component, and based upon a condition of the storage in the first component, an oldest error information is evicted from the first component.
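A minimal sketch of the store-then-evict behavior described above. The abstract does not say what "condition of the storage" triggers eviction; this sketch assumes the condition is that the component is full. `ErrorLog`, `record`, and the entry fields are hypothetical names.

```python
from collections import deque
from typing import Deque, Optional


class ErrorLog:
    """Fixed-capacity error log (a stand-in for the "first component")."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries: Deque[dict] = deque()

    def record(self, transaction_id: int, info: str) -> Optional[dict]:
        """Store error info for a transaction; return the evicted entry, if any."""
        evicted = None
        # Assumed eviction condition: the storage is full.
        if len(self.entries) >= self.capacity:
            evicted = self.entries.popleft()  # the oldest entry leaves first
        self.entries.append({"transaction": transaction_id, "info": info})
        return evicted


log = ErrorLog(capacity=2)
first = log.record(1, "parity error")
second = log.record(2, "timeout")
third = log.record(3, "bad address")  # log is full, so transaction 1 is evicted
```

Returning the evicted entry lets a caller forward it elsewhere instead of losing it; whether the patented system does that is not stated in the abstract.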
Dedicated vector sub-processor system
Granted: April 18, 2023
Patent Number:
11630667
A processor includes a plurality of vector sub-processors (VSPs) and a plurality of memory banks dedicated to respective VSPs. A first memory bank corresponding to a first VSP includes a first plurality of high vector general purpose register (VGPR) banks and a first plurality of low VGPR banks corresponding to the first plurality of high VGPR banks. The first memory bank further includes a plurality of operand gathering components that store operands from respective high VGPR banks and…
Hierarchical state save and restore for device with varying power states
Granted: April 18, 2023
Patent Number:
11630502
A disclosed technique includes triggering a change for a first set of one or more functional elements and for a second set of one or more functional elements from a high-power state to a low-power state; saving first state of the first set of one or more functional elements via a first set of one or more save-state elements; saving second state of the second set of one or more functional elements via a second set of one or more save-state elements; powering down the first set of one or…
Flexible circuit for droop detection
Granted: April 18, 2023
Patent Number:
11630161
A power supply monitor includes a delta-sigma modulator including an input receiving a binary number and an output providing a pulse-density modulated signal, the delta-sigma modulator operable to scale the pulse-density modulated signal based on the binary number. A fast droop detector circuit includes a level shifter providing the modulated signal referenced to a clean supply voltage. A lowpass filter is coupled between the level shifter and a comparator. The comparator produces a…
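The relationship between a binary input and a pulse-density modulated output can be sketched with a first-order accumulator, a standard digital delta-sigma structure. The abstract does not state the modulator order or bit width, so both are assumptions here, and `pulse_density_stream` is a hypothetical name.

```python
from typing import List


def pulse_density_stream(value: int, bits: int, cycles: int) -> List[int]:
    """First-order digital delta-sigma modulator (assumed structure).

    Emits a 1 whenever the running accumulator overflows full scale, so the
    fraction of 1s in the output approaches value / 2**bits.
    """
    full_scale = 1 << bits
    if not 0 <= value < full_scale:
        raise ValueError("value must fit in the given number of bits")
    accumulator = 0
    stream = []
    for _ in range(cycles):
        accumulator += value
        if accumulator >= full_scale:
            accumulator -= full_scale  # keep the residue (quantization error)
            stream.append(1)
        else:
            stream.append(0)
    return stream


# A 4-bit input of 5 over 16 cycles yields a pulse density of 5/16.
stream = pulse_density_stream(value=5, bits=4, cycles=16)
density = sum(stream) / len(stream)
```

Low-pass filtering such a stream recovers an average level proportional to the input number, which is consistent with the filter and comparator stages the abstract mentions.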
Low power and low latency GPU coprocessor for persistent computing
Granted: April 11, 2023
Patent Number:
11625807
Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and…
Termination calibration scheme using a current mirror
Granted: April 11, 2023
Patent Number:
11626874
Systems, apparatuses, and methods for conveying and receiving information as electrical signals in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A termination voltage is generated and sent to the multiple receivers. The termination voltage is coupled to each of signal termination circuitry and signal sampling circuitry within each of the multiple receivers. Any change in the termination…
DRAM command streak management
Granted: April 11, 2023
Patent Number:
11625352
A memory controller includes a command queue and an arbiter for selecting entries from the command queue for transmission to a DRAM. The arbiter transacts streaks of consecutive read commands and streaks of consecutive write commands. The arbiter has a current mode indicating the type of commands currently being transacted, and a cross mode indicating the other type. The arbiter is operable to monitor commands in the command queue for the current mode and the cross mode, and in response…
Mechanism for reducing coherence directory controller overhead for near-memory compute elements
Granted: April 11, 2023
Patent Number:
11625251
A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the…
Preserving memory ordering between offloaded instructions and non-offloaded instructions
Granted: April 11, 2023
Patent Number:
11625249
Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence…
Efficient calibration of circuits in tiled integrated circuits
Granted: April 4, 2023
Patent Number:
11619982
An integrated circuit includes a plurality of tiles receiving a power supply voltage, each having a corresponding analog circuit and operating in response to a first voltage, and a hardware controller receiving a voltage identification code and providing the first voltage to each of the plurality of tiles in response thereto. The hardware controller comprises a test time controller determining coefficients of a waveform that describes an average correspondence between the power supply…
Graphics texture footprint discovery
Granted: April 4, 2023
Patent Number:
11620788
Accesses to a mipmap by a shader in a graphics pipeline are monitored. The mipmap is stored in a memory or cache associated with the shader and the mipmap represents a texture at a hierarchy of levels of detail. A footprint in the mipmap of the texture is marked based on the monitored accesses. The footprint indicates, on a per-tile, per-level-of-detail (LOD) basis, tiles of the mipmap that are expected to be accessed in subsequent shader operations. In some cases, the footprint is…
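The per-tile, per-LOD marking described above can be sketched as a set of tile coordinates built from observed texel accesses. The tile size, the access tuple layout, and the name `mark_footprint` are all hypothetical; the abstract does not specify how the footprint is stored.

```python
from typing import Iterable, Set, Tuple

TILE_SIZE = 4  # hypothetical tile edge length, in texels

Access = Tuple[int, int, int]   # (lod, u, v) texel coordinates seen by the monitor
TileKey = Tuple[int, int, int]  # (lod, tile_x, tile_y)


def mark_footprint(accesses: Iterable[Access]) -> Set[TileKey]:
    """Mark every (LOD, tile) pair touched by a monitored shader access."""
    footprint: Set[TileKey] = set()
    for lod, u, v in accesses:
        footprint.add((lod, u // TILE_SIZE, v // TILE_SIZE))
    return footprint


# Two accesses fall in the same LOD-0 tile, so only three tiles are marked.
observed = [(0, 1, 2), (0, 3, 3), (0, 9, 0), (1, 2, 2)]
footprint = mark_footprint(observed)
```

A later pass could consult `footprint` to decide which tiles of which levels to keep resident; that use is an assumption drawn from the phrase "expected to be accessed in subsequent shader operations".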
Instruction cache prefetch throttle
Granted: April 4, 2023
Patent Number:
11620224
Techniques for controlling prefetching of instructions into an instruction cache are provided. The techniques include tracking either or both of branch target buffer misses and instruction cache misses, modifying a throttle toggle based on the tracking, and adjusting prefetch activity based on the throttle toggle.
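The three steps named in the abstract (track misses, modify a throttle toggle, adjust prefetch activity) can be sketched as below. The abstract gives no thresholds, window sizes, or even the direction of the policy, so every number here and the choice to throttle when misses are frequent are assumptions. `PrefetchThrottle` and its methods are hypothetical names.

```python
class PrefetchThrottle:
    """Toggle-based prefetch throttle driven by miss counts (assumed policy)."""

    def __init__(self, miss_threshold: int = 4, window: int = 16) -> None:
        self.miss_threshold = miss_threshold  # hypothetical tuning value
        self.window = window                  # events per evaluation window
        self.throttled = False                # the "throttle toggle"
        self._misses = 0
        self._events = 0

    def observe(self, btb_miss: bool, icache_miss: bool) -> None:
        """Track branch target buffer and instruction cache misses."""
        self._events += 1
        self._misses += int(btb_miss) + int(icache_miss)
        if self._events == self.window:
            # Assumption: many misses in a window suggest prefetches are not
            # helping, so the toggle is set; few misses clear it.
            self.throttled = self._misses >= self.miss_threshold
            self._misses = 0
            self._events = 0

    def prefetch_depth(self) -> int:
        """Adjust prefetch activity based on the toggle (depths are made up)."""
        return 1 if self.throttled else 4


throttle = PrefetchThrottle(miss_threshold=2, window=4)
for _ in range(4):
    throttle.observe(btb_miss=True, icache_miss=False)
depth_after_misses = throttle.prefetch_depth()

for _ in range(4):
    throttle.observe(btb_miss=False, icache_miss=False)
depth_after_quiet = throttle.prefetch_depth()
```

Evaluating the toggle once per window, rather than on every event, keeps it from flipping on a single miss; a real design might use saturating counters or hysteresis instead.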
Dropout for accelerated deep learning in heterogeneous architectures
Granted: April 4, 2023
Patent Number:
11620525
A heterogeneous processing system includes at least one central processing unit (CPU) core and at least one graphics processing unit (GPU) core. The heterogeneous processing system is configured to compute an activation for each one of a plurality of neurons for a first network layer of a neural network. The heterogeneous processing system randomly drops a first subset of the plurality of neurons for the first network layer and keeps a second subset of the plurality of neurons for the…
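The drop/keep split described above can be sketched for a single layer as follows. The abstract is truncated and does not say how the work is divided between CPU and GPU cores, so this sketch covers only the random partition of neurons into a dropped subset and a kept subset. `drop_neurons` and the drop probability are hypothetical.

```python
import random
from typing import List, Tuple


def drop_neurons(activations: List[float],
                 drop_probability: float,
                 rng: random.Random) -> Tuple[List[float], List[int], List[int]]:
    """Randomly split a layer's neurons into kept and dropped subsets.

    Returns the masked activations plus the indices of kept and dropped
    neurons. Dropped neurons contribute 0.0 to the layer output.
    """
    output: List[float] = []
    kept: List[int] = []
    dropped: List[int] = []
    for index, activation in enumerate(activations):
        if rng.random() < drop_probability:
            dropped.append(index)
            output.append(0.0)
        else:
            kept.append(index)
            output.append(activation)
    return output, kept, dropped


layer = [0.3, 1.2, 0.7, 2.1, 0.9, 0.4]
masked, kept, dropped = drop_neurons(layer, drop_probability=0.5,
                                     rng=random.Random(0))
```

Many dropout implementations also rescale the kept activations by `1 / (1 - drop_probability)`; that step is omitted here because the abstract does not mention it.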
Optical bridge interconnect unit for adjacent processors
Granted: April 4, 2023
Patent Number:
11620248
A system and method for efficient data transfer in a computing system are described. A computing system includes multiple nodes that receive tasks to process. A bridge interconnect transfers data between two processing nodes without the aid of a system bus on the motherboard. One of the multiple bridge interconnects of the computing system is an optical bridge interconnect that transmits optical information across the optical bridge interconnect between two nodes. The receiving node uses…