Automatic voltage reconfiguration
Granted: September 6, 2022
Patent Number:
11435806
Automatic voltage reconfiguration in a computer processor including one or more cores includes executing one or more user-specified workloads; determining, based on the user-specified workloads, a respective minimum safe voltage for each core of one or more cores; and modifying a respective voltage configuration for each core of the one or more cores based on the respective minimum safe voltage.
Bounding volume hierarchy compression
Granted: May 3, 2022
Patent Number:
11321903
A technique for performing ray tracing operations is provided. The technique includes receiving a ray for an intersection test, testing the ray against boxes specified in a bounding volume hierarchy to eliminate one or more boxes or triangles from consideration, unpacking a triangle from a compressed triangle block of the bounding volume hierarchy, the compressed triangle block including two or more triangles that share at least one vertex, and testing the ray for intersection against at…
Selecting cache aging policy for prefetches based on cache test regions
Granted: May 3, 2022
Patent Number:
11321245
A cache controller applies an aging policy to a portion of a cache based on access metrics for different test regions of the cache, whereby each test region implements a different aging policy. The aging policy for each region establishes an initial age value for each entry of the cache, and a particular aging policy can set the age for a given entry based on whether the entry was placed in the cache in response to a demand request from a processor core or in response to a prefetch…
Techniques to improve translation lookaside buffer reach by leveraging idle resources
Granted: May 3, 2022
Patent Number:
11321241
Techniques are disclosed for processing address translations. The techniques include detecting a first miss for a first address translation request for a first address translation in a first translation lookaside buffer, in response to the first miss, fetching the first address translation into the first translation lookaside buffer and evicting a second address translation from the translation lookaside buffer into an instruction cache or local data share memory, detecting a second miss…
Integrated circuit product customizations for identification code visibility
Granted: April 26, 2022
Patent Number:
11315883
An apparatus includes a substrate including an identification code on a first side of the substrate and near a perimeter of the substrate. The apparatus includes a stiffener structure attached to the first side of the substrate. The stiffener structure has a cutout in an outer perimeter of the stiffener structure. The stiffener structure is oriented with respect to the substrate to cause the cutout to expose the identification code. The cutout may have a first dimension and a second…
Region based split-directory scheme to adapt to large cache sizes
Granted: April 26, 2022
Patent Number:
11314646
Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple…
System and method for multiplexer tree indexing
Granted: April 19, 2022
Patent Number:
11308057
Described herein is a system and method for multiplexer tree (muxtree) indexing. Muxtree indexing performs hashing and row reduction in parallel by use of each select bit only once in a particular path of the muxtree. The muxtree indexing generates a different final index as compared to conventional hashed indexing but still results in a fair hash, where all table entries get used with equal distribution with uniformly random selects.
Semi-sorting compression with encoding and decoding tables
Granted: April 19, 2022
Patent Number:
11309911
A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data…
Semiconductor chip with solder cap probe test pads
Granted: April 19, 2022
Patent Number:
11309222
Various semiconductor chips with solder capped probe test pads are disclosed. In accordance with one aspect of the present invention, a semiconductor chip is provided that includes a substrate, plural input/output (I/O) structures on the substrate and plural test pads on the substrate. Each of the test pads includes a first conductor pad and a first solder cap on the first conductor pad.
Compressing texture data on a per-channel basis
Granted: April 19, 2022
Patent Number:
11308648
Sampling circuitry independently accesses channels of texture data that represent a set of pixels. One or more processing units separately compress the channels of the texture data and store compressed data representative of the channels of the texture data for the set of pixels. The channels can include a red channel, a blue channel, and a green channel that represent color values of the set of pixels and an alpha channel that represents degrees of transparency of the set of pixels.…
Dynamic remapping of virtual address ranges using remap vector
Granted: April 19, 2022
Patent Number:
11307993
For one or more stages of execution of a software application at a first processor, a remap vector of a second processor is reconfigured to represent a dynamic mapping of virtual address groups to physical address groups for that stage. Each bit position of the remap vector is configured to store a value indicating whether a corresponding virtual address group is actively mapped to a corresponding physical address group. Address translation operations issued during a stage of execution…
Dynamic voltage frequency scaling based on active memory barriers
Granted: April 19, 2022
Patent Number:
11307631
A processing unit includes compute units partitioned into one or islands that are provided with operating voltages and clock signals having clock frequencies independent of providing operating voltages or clock signals to other islands of compute units. The processing unit also includes dynamic voltage and frequency scaling (DVFS) hardware configured to compute one or more numbers of active memory barriers in the one or more islands. The DVFS hardware is also configured to modify the…
System and method for page-conscious GPU instruction
Granted: April 12, 2022
Patent Number:
11301256
Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method describes a method comprising receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.
Spatial partitioning in a multi-tenancy graphics processing unit
Granted: April 5, 2022
Patent Number:
11295507
A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule…
Memory request throttling to constrain memory bandwidth utilization
Granted: April 5, 2022
Patent Number:
11294810
A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an…
Self-regulating power management for a neural network system
Granted: April 5, 2022
Patent Number:
11294747
A neural network runs a known input data set using an error free power setting and using an error prone power setting. The differences in the outputs of the neural network using the two different power settings determine a high level error rate associated with the output of the neural network using the error prone power setting. If the high level error rate is excessive, the error prone power setting is adjusted to reduce errors by changing voltage and/or clock frequency utilized by the…
Shared resource allocation in a multi-threaded microprocessor
Granted: April 5, 2022
Patent Number:
11294724
An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a…
Thread switch for accesses to slow memory
Granted: April 5, 2022
Patent Number:
11294710
A processing system suspends execution of a program thread based on an access latency required for a program thread to access memory. The processing system employs different memory modules having different memory technologies, located at different points in the processing system, and the like, or a combination thereof. The different memory modules therefore have different access latencies for memory transactions (e.g., memory reads and writes). When a program thread issues a memory…
Scheduler queue assignment
Granted: April 5, 2022
Patent Number:
11294678
Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each…
Enhanced atomics for workgroup synchronization
Granted: March 29, 2022
Patent Number:
11288095
A technique for synchronizing workgroups is provided. The techniques comprise detecting that one or more non-executing workgroups are ready to execute, placing the one or more non-executing workgroups into one or more ready queues based on the synchronization status of the one or more workgroups, detecting that computing resources are available for execution of one or more ready workgroups, and scheduling for execution one or more ready workgroups from the one or more ready queues in an…