Platform agnostic atomic operations
Granted: October 4, 2022
Patent Number:
11461045
A processing unit is configured to access a first memory that supports atomic operations and a second memory via an interface. The second memory or the interface does not support atomicity of the atomic operations. A trap handler is configured to trap atomic operations and enforce atomicity of the trapped atomic operations. The processing unit selectively provides atomic operations to the trap handler in response to detecting that memory access requests in the atomic operations are…
System and method for controlling electrical current supply in a multi-processor core system via instruction per cycle reduction
Granted: October 4, 2022
Patent Number:
11460879
Methods and apparatuses control electrical current supplied to a plurality of processing units in a multi-processor system. A plurality of current usage information corresponding to the processing units are received by a controller to determine a threshold current for each of the processing units. The controller determines a frequency reduction action and an instructions-per-cycle (IPC) reduction action for each of the processing units based on the threshold current and regulates…
Variable precision computing system
Granted: September 27, 2022
Patent Number:
11455766
A processor selectively adjusts the precision of data for different functional units. Specified functional units of the processor, such as a shader processing unit of a graphics processing unit (GPU), include a zeroing module to store, based on the states of corresponding precision flags, a data value of zero at a specified portion of an input and/or output data operand. The functional unit then processes the data including the zeroed portion. Because a portion of the data has been zeroed,…
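The zeroing step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented design: the function name `apply_precision`, the 32-bit operand width, and the choice to zero the low-order bits are all assumptions made for illustration, since the abstract only says that a specified portion of the operand is set to zero when a precision flag is set.

```python
def apply_precision(operand: int, reduced: bool,
                    width: int = 32, zero_low_bits: int = 16) -> int:
    """Return the operand as a functional unit would see it.

    Assumption: the "specified portion" is the low-order `zero_low_bits`
    bits of a `width`-bit operand. The abstract does not say which portion.
    """
    full_mask = (1 << width) - 1
    operand &= full_mask
    if not reduced:
        # Precision flag clear: the operand passes through unchanged.
        return operand
    # Precision flag set: store zero in the selected portion.
    keep_mask = full_mask & ~((1 << zero_low_bits) - 1)
    return operand & keep_mask


full = apply_precision(0x12345678, reduced=False)
low_precision = apply_precision(0x12345678, reduced=True)
```

In this sketch the downstream computation is unchanged; it simply receives an operand whose lower half is zero whenever the flag is set.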
Multi-class multi-label classification using clustered singular decision trees for hardware adaptation
Granted: September 27, 2022
Patent Number:
11455252
Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores…
Enhanced durability for systems on chip (SOCs)
Granted: September 27, 2022
Patent Number:
11455251
A system-on-chip with runtime global push to persistence includes a data processor having a cache, an external memory interface, and a microsequencer. The external memory interface is coupled to the cache and is adapted to be coupled to an external memory. The cache provides data to the external memory interface for storage in the external memory. The microsequencer is coupled to the data processor. In response to a trigger signal, the microsequencer causes the cache to flush the data by…
Dynamic instances semantics
Granted: September 27, 2022
Patent Number:
11455153
A computing system includes a processor and a memory storing instructions for a compiler that, when executed by the processor, cause the processor to generate a control flow graph of program source code by receiving the program source code in the compiler; in the compiler, generating a structure point representation based on the received program source code by inserting into the program source code a set of structure points including an anchor structure point and a join structure point…
Power state transitions
Granted: September 27, 2022
Patent Number:
11455025
A computer processing device transitions among a plurality of power management states and at least one power management sub-state. From a first state, it is determined whether an entry condition for a third state is satisfied. If the entry condition for the third state is satisfied, the third state is entered. If the entry condition for the third state is not satisfied, it is determined whether an entry condition for the first sub-state is satisfied. If the entry condition for the first…
Setting values of portions of registers based on bit values
Granted: September 20, 2022
Patent Number:
11451241
A processor employs a set of bits to indicate values of portions of registers of a register file. In response to a specified instruction indicating an expected change of instruction types to be executed, the processor sets one or more of the bits and, for subsequent instructions, interprets corresponding portions of the registers as having a specified value (e.g., zero). By employing the set of bits to set the values of the register portions, rather than setting the individual portions…
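As a rough software model of the idea in this abstract, the Python sketch below keeps one flag bit per register and interprets the upper portion of a flagged register as zero on read, without rewriting the stored value. The class name `RegisterFile`, the 64-bit width, the 32-bit portion size, and the rule that a full-width write clears the flag are all hypothetical choices that the abstract does not specify.

```python
class RegisterFile:
    """Toy register file with one 'upper portion is zero' bit per register."""

    def __init__(self, count: int, width: int = 64, portion: int = 32):
        self.regs = [0] * count
        self.upper_zero = [False] * count      # the set of indicator bits
        self.low_mask = (1 << portion) - 1
        self.full_mask = (1 << width) - 1

    def write(self, idx: int, value: int) -> None:
        # Assumption: a full-width write makes the whole register valid again.
        self.regs[idx] = value & self.full_mask
        self.upper_zero[idx] = False

    def mark_upper_zero(self) -> None:
        # Models the "specified instruction": only the bits are set,
        # the stored register contents are left untouched.
        self.upper_zero = [True] * len(self.regs)

    def read(self, idx: int) -> int:
        value = self.regs[idx]
        # A set bit means the upper portion is interpreted as zero.
        return value & self.low_mask if self.upper_zero[idx] else value


rf = RegisterFile(count=4)
rf.write(0, 0xAABBCCDD11223344)
rf.mark_upper_zero()
visible = rf.read(0)   # upper half reads as zero, storage is unchanged
```

The point of the model is that `mark_upper_zero` touches only the flag bits, which is cheaper than clearing the corresponding portion of every register.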
System and method for providing system level sleep state power savings
Granted: September 20, 2022
Patent Number:
11449346
A system for providing system level sleep state power savings includes a plurality of memory channels and a corresponding plurality of memories coupled to respective memory channels. The system includes one or more processors operative to receive information indicating that a system level sleep state is to be entered and, in response to receiving the system level sleep indication, move data stored in at least a first of the plurality of memories to at least a second of the plurality of…
Secure computer vision processing
Granted: September 13, 2022
Patent Number:
11443051
A computer vision processor in an image cluster defines a fenced memory region (FMR) that controls access to image data stored in a first portion of a trusted memory region (TMR). The computer vision processor receives FMR requests from an application implemented in a processing cluster. The FMR requests are to access the image data in the first portion of the TMR. The computer vision processor selectively allows the requesting application to access the image data. In some cases, the…
Controlling prediction functional blocks used by a branch predictor in a processor
Granted: September 13, 2022
Patent Number:
11442727
An electronic device includes a processor, a branch predictor in the processor, and a predictor controller in the processor. The branch predictor includes multiple prediction functional blocks, each prediction functional block configured for generating predictions for control transfer instructions (CTIs) in program code based on respective prediction information, the branch predictor configured to select, from among predictions generated by the prediction functional blocks for each CTI,…
Separate clocking for components of a graphics processing unit
Granted: September 13, 2022
Patent Number:
11442495
Systems and methods related to controlling clock signals for clocking shader engine modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal…
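The independent selection of CLKA and CLKB can be sketched in Python as two divider choices driven by separate utilization figures. The abstract does not describe the selection policy, so the thresholds, the divider set, and the function names `pick_divider` and `domain_clocks` below are purely illustrative assumptions.

```python
def pick_divider(utilization: float, dividers=(1, 2, 4)) -> int:
    """Assumed policy: a busier domain gets a smaller divider (faster clock)."""
    if utilization >= 0.75:
        return dividers[0]
    if utilization >= 0.25:
        return dividers[1]
    return dividers[2]


def domain_clocks(clk_mhz: float, se_util: float, nse_util: float):
    """Derive CLKA (shader engines) and CLKB (non-shader-engine modules)
    from one source clock, each from its own performance-counter summary."""
    clka = clk_mhz / pick_divider(se_util)
    clkb = clk_mhz / pick_divider(nse_util)
    return clka, clkb


# Shader engines busy, the rest of the GPU mostly idle.
clka, clkb = domain_clocks(2000, se_util=0.9, nse_util=0.1)
```

Each domain's frequency depends only on its own counters, so a compute-heavy phase can keep the SEs at full speed while the nSEs run slower.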
Automatic voltage reconfiguration
Granted: September 6, 2022
Patent Number:
11435806
Automatic voltage reconfiguration in a computer processor including one or more cores includes executing one or more user-specified workloads; determining, based on the user-specified workloads, a respective minimum safe voltage for each core of one or more cores; and modifying a respective voltage configuration for each core of the one or more cores based on the respective minimum safe voltage.
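The three steps in this abstract (run the workloads, find a per-core minimum safe voltage, update each core's configuration) might look like the following Python sketch. The abstract does not say how the minimum safe voltage is found; the downward stepping search, the `passes` callback, and the millivolt parameters are assumptions introduced only for illustration.

```python
from typing import Callable, Dict, Iterable, List


def find_min_safe_voltage(core: int, workloads: List[str],
                          passes: Callable[[int, str, int], bool],
                          start_mv: int, floor_mv: int, step_mv: int) -> int:
    """Lower the voltage step by step while every workload still passes.

    Assumption: `start_mv` is known to be safe, so it is returned if the
    very first lower step fails. `passes(core, workload, mv)` is a
    hypothetical stand-in for actually executing a workload on the core.
    """
    safe = start_mv
    candidate = start_mv
    while candidate >= floor_mv:
        if all(passes(core, w, candidate) for w in workloads):
            safe = candidate
            candidate -= step_mv
        else:
            break
    return safe


def reconfigure(cores: Iterable[int], workloads: List[str],
                passes: Callable[[int, str, int], bool],
                start_mv: int = 1100, floor_mv: int = 800,
                step_mv: int = 50) -> Dict[int, int]:
    """Return the new voltage configuration, one entry per core."""
    return {core: find_min_safe_voltage(core, workloads, passes,
                                        start_mv, floor_mv, step_mv)
            for core in cores}


# Simulated silicon: core 0 is stable down to 900 mV, core 1 down to 950 mV.
limits = {0: 900, 1: 950}
config = reconfigure([0, 1], ["user_workload"],
                     lambda core, w, mv: mv >= limits[core])
```

Because the search runs per core, cores with better silicon end up with lower voltage settings than their neighbours.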
Offset-aligned three-dimensional integrated circuit
Granted: September 6, 2022
Patent Number:
11437359
A method for manufacturing a three-dimensional integrated circuit includes attaching a first side of a first die to a first carrier wafer. The method includes preparing a second side of the first die to generate a prepared second side of the first die. The method includes attaching the prepared second side of the first die to a second carrier wafer. The method includes removing the first carrier wafer from the first side of the first die to form a transitional three-dimensional…
Folded cell layout for 6T SRAM cell
Granted: September 6, 2022
Patent Number:
11437316
A layout for a 6T SRAM cell is disclosed. The cell layout takes a conventional 6T SRAM cell layout and restructures the layout into a more square cell layout with a single p-channel and a single n-channel across the width of the cell. Restructuring the cell layout reduces the height of wordlines and allows dual wordlines to be placed in the cell to reduce wordline resistance in the cell. Dual pairs of bitlines may also be placed in separate metal layers in the cell layout to reduce…
Automatic part testing
Granted: September 6, 2022
Patent Number:
11436114
Automatic part testing includes: booting a part under testing into a first operating environment; executing, via the first operating environment, one or more test patterns on the part; performing a comparison between one or more observed characteristics associated with the one or more test patterns and one or more expected characteristics; and modifying one or more operational parameters of a central processing unit of the part based on the comparison.
Proactive management of inter-GPU network links
Granted: September 6, 2022
Patent Number:
11436060
Systems, apparatuses, and methods for proactively managing inter-processor network links are disclosed. A computing system includes at least a control unit and a plurality of processing units. Each processing unit of the plurality of processing units includes a compute module and a configurable link interface. The control unit dynamically adjusts a clock frequency and a link width of the configurable link interface of each processing unit based on a data transfer size and layer…
Techniques for improving operand caching
Granted: September 6, 2022
Patent Number:
11436016
A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should…
Optimizing runtime alias checks
Granted: September 6, 2022
Patent Number:
11435987
Optimizing runtime alias checks includes identifying, by a compiler, a base pointer and a plurality of different memory accesses based on the base pointer in a code loop; generating, by the compiler, a first portion of runtime code to determine a minimum access and a maximum access of the plurality of different memory accesses; and generating, by the compiler, a second portion of runtime code including one or more runtime alias checks for the minimum access and one or more runtime alias…
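The benefit of checking only the minimum and maximum access can be shown with a small Python model in which addresses are plain integers. This is a sketch of the general idea rather than the compiler transformation itself: the function names, the fixed 4-byte access size, and the half-open range used for the other buffer are assumptions.

```python
from typing import Iterable, Tuple


def min_max_access(base: int, offsets: Iterable[int]) -> Tuple[int, int]:
    """Lowest and highest start address among accesses base + offset."""
    offsets = list(offsets)
    return base + min(offsets), base + max(offsets)


def may_alias(base: int, offsets: Iterable[int],
              other_start: int, other_len: int,
              access_size: int = 4) -> bool:
    """Two comparisons against [other_start, other_start + other_len)
    replace one overlap check per individual access.

    The result is conservative: it can report a possible alias for a gap
    between accesses, but never misses a real overlap.
    """
    lo, hi = min_max_access(base, offsets)
    other_end = other_start + other_len          # exclusive bound
    entirely_below = hi + access_size <= other_start
    entirely_above = lo >= other_end
    return not (entirely_below or entirely_above)


# Accesses at p+0, p+8 and p+16 with p = 1000, checked against a buffer
# that begins right after the last accessed byte.
disjoint = may_alias(1000, [0, 8, 16], other_start=1020, other_len=16)
```

When the check reports no alias, a compiler could safely run the optimized version of the loop; otherwise it would fall back to the unoptimized one.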
Neural network power management in a multi-GPU system
Granted: September 6, 2022
Patent Number:
11435813
Systems, apparatuses, and methods for managing power consumption for a neural network implemented on multiple graphics processing units (GPUs) are disclosed. A computing system includes a plurality of GPUs implementing a neural network. In one implementation, the plurality of GPUs draw power from a common power supply. To prevent the power consumption of the system from exceeding a power limit for long durations, the GPUs coordinate the scheduling of tasks of the neural network. At least…