Dual purpose millimeter wave frequency band transmitter
Granted: July 23, 2024
Patent Number:
12044774
Systems, apparatuses, and methods for implementing a dual-purpose millimeter-wave frequency band transmitter are disclosed. A system includes a dual-purpose transmitter sending a video stream over a wireless link to a receiver. In some embodiments, the video stream is generated as part of an augmented reality (AR) or virtual reality (VR) application. The transmitter operates in a first mode to scan and map an environment of the transmitter and receiver. The transmitter generates radio…
Adaptive batch reuse on deep memories
Granted: July 16, 2024
Patent Number:
12039450
A method of adaptive batch reuse includes prefetching, from a CPU to a GPU, a first plurality of mini-batches comprising a subset of a training dataset. The GPU trains the neural network for the current epoch by reusing, without discard, the first plurality of mini-batches in training the neural network for the current epoch based on a reuse count value. The GPU also runs a validation set to identify a validation error for the current epoch. If the validation error for the current epoch…
Loader and runtime operations for heterogeneous code objects
Granted: July 16, 2024
Patent Number:
12039344
Described herein are techniques for executing a heterogeneous code object executable. According to the techniques, a loader identifies a first memory appropriate for loading a first architecture-specific portion of the heterogeneous code object executable, wherein the first architecture specific portion includes instructions for a first architecture, identifies a second memory appropriate for loading a second architecture-specific portion of the heterogeneous code object executable,…
Processor with multiple fetch and decode pipelines
Granted: July 16, 2024
Patent Number:
12039337
A processor employs a plurality of fetch and decode pipelines by dividing an instruction stream into instruction blocks with identified boundaries. The processor includes a branch predictor that generates branch predictions. Each branch prediction corresponds to a branch instruction and includes a prediction that the corresponding branch is to be taken or not taken. In addition, each branch prediction identifies both an end of the current branch prediction window and the start of another…
Memory controller with a plurality of command sub-queues and corresponding arbiters
Granted: July 16, 2024
Patent Number:
12038856
A memory controller includes a memory channel controller that uses multiple groups of command queue and arbiter pairs. Each arbiter is coupled to a respective command queue to select memory access commands from each command queue according to predetermined criteria. Each arbiter selects from among the memory access requests in each command queue independently based on the predetermined criteria and sends selected memory access requests to a selector that serves as a second level arbiter…
A/D bit storage, processing, and modes
Granted: July 16, 2024
Patent Number:
12038847
A/D bit storage, processing, and mode management techniques through use of a dense A/D bit representation are described. In one example, a memory management unit employs an A/D bit representation generation module to generate the dense A/D bit representation. In an implementation, the A/D bit representation is stored adjacent to existing page table structures of the multilevel page table hierarchy. In another example, memory management unit supports use of modes as part of A/D bit…
User configurable hardware settings for overclocking
Granted: July 16, 2024
Patent Number:
12038779
User configurable hardware settings for overclocking is described. In accordance with the described techniques, user input to adjust hardware settings for operating a processing unit in an overclocking mode is received. The user input, for example, adjusts at least one of a voltage droop threshold or a frequency adjustment of the clock rate. A voltage droop is detected while operating the processing unit in the overclocking mode. Responsive to detecting the voltage droop, a clock rate of…
Interconnect architecture for three-dimensional processing systems
Granted: July 9, 2024
Patent Number:
12033714
A processing system includes a plurality of processor cores formed in a first layer of an integrated circuit device and a plurality of partitions of memory formed in one or more second layers of the integrated circuit device. The one or more second layers are deployed in a stacked configuration with the first layer. Each of the partitions is associated with a subset of the processor cores that have overlapping footprints with the partitions. The processing system also includes first…
Combination scheme for baseline wander, direct current level shifting, and receiver linear equalization for high speed links
Granted: July 9, 2024
Patent Number:
12034440
Systems, apparatuses, and methods for implementing a combo scheme for direct current (DC) level shifting of signals are disclosed. A receiver circuit receives an input signal on a first interface. The first interface is coupled to a resistor in parallel with a capacitor which passes the input signal to a second interface. Also, the first interface is coupled to a first pair of current sources between ground and a voltage source, and the second interface is coupled to a second pair of…
Split read port latch array bit cell
Granted: July 9, 2024
Patent Number:
12033721
An apparatus and method for providing efficient floor planning, power, and performance tradeoffs of memory accesses. Adjacent bit cells in a column of an array use a split read port such that the bit cells do not share a read bit line while sharing a write bit line. The adjacent bit cells include asymmetrical read access circuits that convey data stored by latch circuitry of a corresponding bit cell to a corresponding read bit line. The layout of adjacent bit cells provides a number of…
System and methods for efficient execution of a collaborative task in a shader system
Granted: July 9, 2024
Patent Number:
12033275
Methods and systems are disclosed for executing a collaborative task in a shader system. Techniques disclosed include receiving, by the system, input data and computing instructions associated with the collaborative task, as well as a configuration setting, causing the system to operate in a takeover mode. The system then launches, exclusively in one workgroup processor, a workgroup including wavefronts configured to execute the collaborative task.
Dead surface invalidation
Granted: July 9, 2024
Patent Number:
12033239
Systems, apparatuses, and methods for performing dead surface invalidation are disclosed. An application sends draw call commands to a graphics processing unit (GPU) via a driver, with the draw call commands rendering to surfaces. After it is determined that a given surface will no longer be accessed by subsequent draw calls, the application sends a surface invalidation command for the given surface to a command processor of the GPU. After the command processor receives the surface…
Register compaction with early release
Granted: July 9, 2024
Patent Number:
12033238
Systems, apparatuses, and methods for implementing register compaction with early release are disclosed. A processor includes at least a command processor, a plurality of compute units, a plurality of registers, and a control unit. Registers are statically allocated to wavefronts by the control unit when wavefronts are launched by the command processor on the compute units. In response to determining that a first set of registers, previously allocated to a first wavefront, are no longer…
Method and apparatus for predicting kernel tuning parameters
Granted: July 9, 2024
Patent Number:
12033035
A processing device, which improves processing performance, is provided which comprises memory configured to store data and a processor, in communication with the memory. The processor is configured to receive tuning parameters, each having a numeric value, for executing a portion of a program on an identified hardware device and convert the numeric values of the tuning parameters to words. The processor is also configured to predict, using one or more machine language learning…
Partial sorting for coherency recovery
Granted: July 9, 2024
Patent Number:
12032967
Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits,…
Throttling while managing upstream resources
Granted: July 9, 2024
Patent Number:
12032965
Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies…
System and method for self-resizing associative probabilistic hash-based data structures
Granted: July 9, 2024
Patent Number:
12032548
A method of maintaining a probabilistic filter includes, in response to receiving a key K1 for adding to the probabilistic filter, generating a fingerprint F1 based on applying a fingerprint hash function HF to the key K1, identifying an initial bucket Bi1 by selecting between at least a first bucket B1 determined based on a first bucket hash function H1 of the key K1 and a second bucket B2 determined based on a second bucket hash function H2 of the key K1, and inserting the fingerprint…
Access log and address translation log for a processor
Granted: July 9, 2024
Patent Number:
12032487
A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In…
System and method for storing cache location information for cache entry transfer
Granted: July 2, 2024
Patent Number:
12026099
A cache stores, along with data that is being transferred from a higher level cache to a lower level cache, information indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache…
Dynamic memory reconfiguration
Granted: July 2, 2024
Patent Number:
12026380
A processing system including a parallel processing unit selectively allocating pages of memory for interleaving across configurable subsets of channels based on a mode of allocation. In some embodiments, in a first mode, a page of memory is allocated to and interleaved across a plurality of channels, and in a second mode, a page of memory is allocated to and interleaved across a subset of the plurality of channels.