Interconnect architecture for three-dimensional processing systems
Granted: July 9, 2024
Patent Number:
12033714
A processing system includes a plurality of processor cores formed in a first layer of an integrated circuit device and a plurality of partitions of memory formed in one or more second layers of the integrated circuit device. The one or more second layers are deployed in a stacked configuration with the first layer. Each of the partitions is associated with a subset of the processor cores that have overlapping footprints with the partitions. The processing system also includes first…
System and methods for efficient execution of a collaborative task in a shader system
Granted: July 9, 2024
Patent Number:
12033275
Methods and systems are disclosed for executing a collaborative task in a shader system. Techniques disclosed include receiving, by the system, input data and computing instructions associated with the collaborative task, as well as a configuration setting, causing the system to operate in a takeover mode. The system then launches, exclusively in one workgroup processor, a workgroup including wavefronts configured to execute the collaborative task.
Dead surface invalidation
Granted: July 9, 2024
Patent Number:
12033239
Systems, apparatuses, and methods for performing dead surface invalidation are disclosed. An application sends draw call commands to a graphics processing unit (GPU) via a driver, with the draw call commands rendering to surfaces. After it is determined that a given surface will no longer be accessed by subsequent draw calls, the application sends a surface invalidation command for the given surface to a command processor of the GPU. After the command processor receives the surface…
Register compaction with early release
Granted: July 9, 2024
Patent Number:
12033238
Systems, apparatuses, and methods for implementing register compaction with early release are disclosed. A processor includes at least a command processor, a plurality of compute units, a plurality of registers, and a control unit. Registers are statically allocated to wavefronts by the control unit when wavefronts are launched by the command processor on the compute units. In response to determining that a first set of registers, previously allocated to a first wavefront, are no longer…
Method and apparatus for predicting kernel tuning parameters
Granted: July 9, 2024
Patent Number:
12033035
A processing device that improves processing performance is provided. The device comprises memory configured to store data and a processor in communication with the memory. The processor is configured to receive tuning parameters, each having a numeric value, for executing a portion of a program on an identified hardware device and convert the numeric values of the tuning parameters to words. The processor is also configured to predict, using one or more machine language learning…
Partial sorting for coherency recovery
Granted: July 9, 2024
Patent Number:
12032967
Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits,…
Throttling while managing upstream resources
Granted: July 9, 2024
Patent Number:
12032965
Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies…
System and method for self-resizing associative probabilistic hash-based data structures
Granted: July 9, 2024
Patent Number:
12032548
A method of maintaining a probabilistic filter includes, in response to receiving a key K1 for adding to the probabilistic filter, generating a fingerprint F1 based on applying a fingerprint hash function HF to the key K1, identifying an initial bucket Bi1 by selecting between at least a first bucket B1 determined based on a first bucket hash function H1 of the key K1 and a second bucket B2 determined based on a second bucket hash function H2 of the key K1, and inserting the fingerprint…
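The insertion step described above (fingerprint F1 from HF, then a choice between buckets B1 and B2 from H1 and H2) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the SHA-256-based hashes, the bucket count, the bucket capacity, the 8-bit fingerprint width, and the "pick the emptier bucket" selection policy are all assumptions, and the self-resizing behavior named in the title is not modeled because the abstract is truncated before describing it.

```python
import hashlib

NUM_BUCKETS = 8   # assumed table size, not from the abstract
BUCKET_SIZE = 4   # assumed slots per bucket, not from the abstract


def _hash(tag: bytes, key: bytes) -> int:
    """Hypothetical helper: derive an independent hash per tag from SHA-256."""
    return int.from_bytes(hashlib.sha256(tag + key).digest()[:8], "big")


def fingerprint(key: bytes) -> int:
    """HF: assumed 8-bit fingerprint in the range 1..255."""
    return _hash(b"HF", key) % 255 + 1


def candidate_buckets(key: bytes) -> tuple[int, int]:
    """B1 and B2 from the bucket hash functions H1 and H2 of the key."""
    return _hash(b"H1", key) % NUM_BUCKETS, _hash(b"H2", key) % NUM_BUCKETS


def insert(buckets: list[list[int]], key: bytes) -> bool:
    """Store the key's fingerprint in one of its two candidate buckets."""
    f1 = fingerprint(key)
    b1, b2 = candidate_buckets(key)
    # Assumed selection policy: prefer whichever candidate is emptier.
    initial = b1 if len(buckets[b1]) <= len(buckets[b2]) else b2
    if len(buckets[initial]) >= BUCKET_SIZE:
        return False  # both candidates full; eviction/resizing not modeled
    buckets[initial].append(f1)
    return True


def might_contain(buckets: list[list[int]], key: bytes) -> bool:
    """Probabilistic lookup: false positives possible, no false negatives."""
    f1 = fingerprint(key)
    b1, b2 = candidate_buckets(key)
    return f1 in buckets[b1] or f1 in buckets[b2]
```

A table here is just `[[] for _ in range(NUM_BUCKETS)]`; a key that was inserted successfully is always reported by `might_contain`, while a key that was never inserted may occasionally be reported as present.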
Access log and address translation log for a processor
Granted: July 9, 2024
Patent Number:
12032487
A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In…
Lookup table optimization for high speed transmit feed-forward equalization link
Granted: July 2, 2024
Patent Number:
12028190
A driver circuit includes a feed-forward equalization (FFE) circuit. The FFE circuit receives a plurality of pulse-amplitude modulation (PAM) symbol values to be transmitted at one of multiple PAM levels. The FFE circuit includes a first partial lookup table, one or more additional partial lookup tables, and an adder circuit. The first partial lookup table contains partial finite impulse-response (FIR) values and is indexed based on a current PAM symbol value, a precursor PAM symbol value,…
DRAM row management for processing in memory
Granted: July 2, 2024
Patent Number:
12026401
In accordance with described techniques for DRAM row management for processing in memory, a plurality of instructions are obtained for execution by a processing in memory component embedded in a dynamic random access memory. An instruction is identified that last accesses a row of the dynamic random access memory, and a subsequent instruction is identified that first accesses an additional row of the dynamic random access memory. A first command is issued to close the row and a second…
Page swapping to protect memory devices
Granted: July 2, 2024
Patent Number:
12026387
A page swapping memory protection system tracks accesses to physical memory pages, such as in a table with each row storing a physical memory page address and a counter value. This counter value records the number of accesses (e.g., read access or write accesses) to the corresponding physical memory page. In response to one of the counters exceeding a threshold value, the corresponding physical memory page is swapped with another page in physical memory (e.g., a page the table indicates…
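The counter table described above can be modeled in a few lines. This is a minimal software sketch, not the patented design: the class name, the threshold value, the counter reset after a swap, and the choice of swap partner are assumptions. In particular, the abstract is truncated before it says which page is chosen as the partner, so the sketch picks the tracked page with the lowest count purely for illustration.

```python
THRESHOLD = 3  # assumed value; the abstract does not give one


class PageAccessTracker:
    """Hypothetical model of a table of (physical page address, counter) rows."""

    def __init__(self, threshold: int = THRESHOLD):
        self.threshold = threshold
        self.counts: dict[int, int] = {}          # page address -> access count
        self.swaps: list[tuple[int, int]] = []    # (hot page, partner page)

    def access(self, page: int):
        """Record one read or write; return the swap partner if a swap fired."""
        self.counts[page] = self.counts.get(page, 0) + 1
        if self.counts[page] > self.threshold:
            return self._swap(page)
        return None

    def _swap(self, hot: int):
        others = [p for p in self.counts if p != hot]
        if not others:
            return None  # nothing else is tracked, so no swap is possible
        # Assumed policy: swap with the least-accessed tracked page.
        partner = min(others, key=lambda p: self.counts[p])
        self.swaps.append((hot, partner))
        # Assumed: both counters restart after the pages trade places.
        self.counts[hot] = 0
        self.counts[partner] = 0
        return partner
```

With `threshold=2`, the third access to the same page triggers a swap, provided at least one other page has been seen.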
Dynamic memory reconfiguration
Granted: July 2, 2024
Patent Number:
12026380
A processing system includes a parallel processing unit that selectively allocates pages of memory for interleaving across configurable subsets of channels based on a mode of allocation. In some embodiments, in a first mode, a page of memory is allocated to and interleaved across a plurality of channels, and in a second mode, a page of memory is allocated to and interleaved across a subset of the plurality of channels.
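The two allocation modes can be illustrated with a simple address-to-channel mapping. This is a sketch under stated assumptions: the channel count, the interleave granularity, the mode names `"first"` and `"second"`, and the round-robin mapping are all hypothetical and are not taken from the patent.

```python
NUM_CHANNELS = 8          # assumed number of memory channels
INTERLEAVE_BYTES = 256    # assumed interleave granularity


def channels_for_page(mode: str, subset=None) -> list[int]:
    """Return the channels a page is interleaved across in the given mode."""
    all_channels = list(range(NUM_CHANNELS))
    if mode == "first":
        return all_channels
    if mode == "second":
        if not subset or not set(subset) <= set(all_channels):
            raise ValueError("second mode needs a non-empty subset of channels")
        return list(subset)
    raise ValueError(f"unknown mode: {mode}")


def channel_for(offset: int, channels: list[int]) -> int:
    """Round-robin interleave: consecutive chunks go to consecutive channels."""
    return channels[(offset // INTERLEAVE_BYTES) % len(channels)]
```

For example, with `subset=[2, 3]` in the second mode, consecutive 256-byte chunks of a page alternate between channels 2 and 3, while in the first mode they rotate through all eight channels.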
System and method for storing cache location information for cache entry transfer
Granted: July 2, 2024
Patent Number:
12026099
A cache stores, along with data that is being transferred from a higher level cache to a lower level cache, information indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache…
Build process for application performance
Granted: July 2, 2024
Patent Number:
12026080
Systems and methods for building applications by automatically incorporating application performance data into the application build process are disclosed. By capturing build settings and performance data from prior applications being executed on different computing systems such as bare metal and virtualized cloud instances, a performance database may be maintained and used to predict build settings that improve application performance (e.g., on a specific computing system or computing…
System and method to reduce power down entry and exit latency
Granted: June 25, 2024
Patent Number:
12019499
A system and method for fast save/restore is disclosed. The system and method include one or more logical units (LUs) residing in independent power domains, one or more digital frequency synthesizers (DFS), each of the one or more DFS associated with one of the one or more LUs, the one or more DFSs configured to lock a system complex frequency and ramp the one or more LUs to system complex frequency, and one or more slave fast save/restore control (FSRC) units, each slave FSRC unit…
Alleviating interconnect traffic in a disaggregated memory system
Granted: June 25, 2024
Patent Number:
12019904
One or both of read and write accesses to a fabric-attached memory module via a fabric interconnect are monitored. In one or more implementations, offloading of one or more tasks accessing the fabric-attached memory module to a processor of a routing system associated with the fabric-attached memory module is initiated based on the read and write accesses to the fabric-attached memory module. Additionally or alternatively, replicating memory of the fabric-attached memory module to a…
Feed forward training of memory interfaces
Granted: June 25, 2024
Patent Number:
12019876
A data processor, system, method, and integrated circuit are provided which update timing values for accessing a memory to compensate for voltage and temperature (VT) drift during operation. The method includes performing a link retraining sequence for a plurality of DQ lanes of the memory bus and determining a first phase offset based on the link retraining. The method includes calculating a second offset based on the first offset, applying the second offset to a plurality of command CA…
Arbitrating atomic memory operations
Granted: June 25, 2024
Patent Number:
12019566
Arbitrating atomic memory operations, including: receiving, by a media controller, a plurality of atomic memory operations; determining, by an atomics controller associated with the media controller, based on one or more arbitration rules, an ordering for issuing the plurality of atomic memory operations; and issuing the plurality of atomic memory operations to a memory module according to the ordering.
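The receive, order, and issue flow above can be sketched as a sort over a list of rule functions. The abstract does not say what the arbitration rules are, so the two rules below (higher priority first, then lower address first) are purely hypothetical, as are the `AtomicOp` fields and the list standing in for the memory module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicOp:
    """Hypothetical record for one atomic memory operation."""
    op_id: int
    address: int
    kind: str
    priority: int = 0


def by_priority(op: AtomicOp) -> int:
    """Assumed rule: higher priority issues first."""
    return -op.priority


def by_address(op: AtomicOp) -> int:
    """Assumed rule: lower address issues first."""
    return op.address


def arbitrate(ops, rules):
    """Order ops by the rules in sequence; ties keep arrival order."""
    return sorted(ops, key=lambda op: tuple(rule(op) for rule in rules))


def issue(ops, rules, memory_module: list) -> None:
    """Issue ops to a stand-in memory module (a list of op ids) in order."""
    for op in arbitrate(ops, rules):
        memory_module.append(op.op_id)
```

Because Python's `sorted` is stable, operations that tie on every rule are issued in the order they were received.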
Virtual partitioning a processor-in-memory (“PIM”)
Granted: June 25, 2024
Patent Number:
12019560
Process isolation for a PIM device includes: receiving, from a process, a call to allocate a virtual address space where the process stores a PIM configuration context; allocating the virtual address space including mapping a physical address space including PIM device configuration registers to the virtual address space only if the physical address space is not mapped to another process's virtual address space; and programming the PIM device configuration space according to the…
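The exclusivity condition above (map the PIM configuration registers only if no other process already has them mapped) reduces to an ownership check. The following is a toy model under stated assumptions: the class, the region identifiers, and the release operation are hypothetical, and the programming step that follows allocation is not modeled because the abstract is truncated there.

```python
class PimMapper:
    """Hypothetical tracker of which process maps each PIM register region."""

    def __init__(self):
        self.owner: dict[int, int] = {}  # physical region id -> owning pid

    def allocate(self, pid: int, phys_region: int) -> bool:
        """Map the region for pid only if no other process has it mapped."""
        current = self.owner.get(phys_region)
        if current is not None and current != pid:
            return False  # already mapped into another process's address space
        self.owner[phys_region] = pid
        return True

    def release(self, pid: int, phys_region: int) -> bool:
        """Assumed counterpart: unmap the region if pid is its owner."""
        if self.owner.get(phys_region) != pid:
            return False
        del self.owner[phys_region]
        return True
```

A second process requesting the same region is refused until the first releases it.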