Memory access commands with near-memory address generation
Granted: January 4, 2022
Patent Number:
11216373
A memory controller may be configured with command logic that is capable of sending a memory access command having incomplete address information via a command/address bus that connects the memory controller to memory modules. The memory controller may send the memory access command via the bus for accessing data stored at memory locations of the memory modules. The memory locations may correspond to different near-memory generated addresses, reflecting that the data is not address aligned across…
Loop exit predictor
Granted: January 4, 2022
Patent Number:
11216279
A processor includes a prediction engine coupled to a training engine. The prediction engine includes a loop exit predictor. The training engine includes a loop exit branch monitor coupled to a loop detector. Based on at least one of a plurality of call return levels, the loop detector of the processor takes a snapshot of a retired predicted block during a first retirement time, compares the snapshot to a subsequent retired predicted block at a second retirement time, and based on the…
Dynamic, variable bit-width numerical precision on field-programmable gate arrays for machine learning tasks
Granted: January 4, 2022
Patent Number:
11216250
A method includes providing a set of one or more computational units implemented in a set of one or more field programmable gate array (FPGA) devices, where the set of one or more computational units is configured to generate a plurality of output values based on one or more input values. The method further includes, for each computational unit of the set of computational units, performing a first calculation in the computational unit using a first number representation, where a first…
Modifying an operating state of a processing unit based on waiting statuses of blocks
Granted: January 4, 2022
Patent Number:
11216052
A processing unit includes a plurality of components configured to execute instructions and a controller. The controller is configured to determine a power consumption of the processing unit, determine a waiting status of the processing unit based on waiting statuses of components, and selectively modify an operating state of the processing unit based on the waiting status and the power consumption of the processing unit. In some cases, the operating state is modified in response to a…
Probe interrupt delivery
Granted: December 28, 2021
Patent Number:
11210246
Systems, apparatuses, and methods for routing interrupts on a coherency probe network are disclosed. A computing system includes a plurality of processing nodes, a coherency probe network, and one or more control units. The coherency probe network carries coherency probe messages between coherent agents. Interrupts that are detected by a control unit are converted into messages that are compatible with coherency probe messages and then routed to a target destination via the coherency…
Side information for video data transmission
Granted: December 28, 2021
Patent Number:
11212537
Systems, apparatuses, and methods for performing efficient video compression are disclosed. A video processing system includes a transmitter sending a video stream over a wireless link to a receiver. The transmitter includes a processor and an encoder. The processor generates rendered blocks of pixels of a video frame, and when the processor predicts a compression level for a given region of the video frame is different from a compression level for immediately neighboring blocks, the…
Molded die last chip combination
Granted: December 28, 2021
Patent Number:
11211332
Various multi-die arrangements and methods of manufacturing the same are disclosed. In one aspect, a method of manufacturing a semiconductor chip device is provided. A redistribution layer (RDL) structure is fabricated with a first side and second side opposite to the first side. An interconnect chip is mounted on the first side of the RDL structure. A first semiconductor chip and a second semiconductor chip are mounted on the second side of the RDL structure after mounting the…
Standard cell layout architectures and drawing styles for 5nm and beyond
Granted: December 28, 2021
Patent Number:
11211330
A system and method for efficiently creating layout for a standard cell are described. A standard cell to be used for an integrated circuit uses a full trench silicide strap as drain regions for a pmos transistor and an nmos transistor. Multiple unidirectional routes in metal zero are placed across the standard cell where each route connects to a trench silicide contact. Power and ground connections utilize pins rather than end-to-end rails in the standard cell. Additionally,…
GPU packet aggregation system
Granted: December 28, 2021
Patent Number:
11210757
A graphics processing unit (GPU) includes a packet management component that automatically aggregates data from input packets. In response to determining that a received first input packet does not indicate a send condition, and in response to determining that a generated output packet would be smaller than an output size threshold, the packet management component aggregates data corresponding to the first input packet with data corresponding to a second input packet stored at a packet…
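The aggregation rule in the abstract above can be illustrated with a minimal Python sketch. The abstract is truncated, so everything beyond the two stated conditions is an assumption: the `send` flag standing in for the "send condition", the byte-count threshold, and the choice to emit the buffered data once either condition fails are all hypothetical, as are the names `InputPacket` and `PacketAggregator`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InputPacket:
    payload: bytes
    send: bool = False  # hypothetical flag standing in for the "send condition"


@dataclass
class PacketAggregator:
    output_size_threshold: int  # assumed to be measured in bytes
    buffer: bytearray = field(default_factory=bytearray)

    def receive(self, packet: InputPacket) -> Optional[bytes]:
        """Buffer the packet; return an output packet if one is emitted, else None."""
        self.buffer += packet.payload
        # Keep aggregating while no send condition is indicated and the
        # output packet would still be smaller than the threshold.
        if not packet.send and len(self.buffer) < self.output_size_threshold:
            return None
        # Assumed behavior: otherwise emit everything buffered so far.
        out = bytes(self.buffer)
        self.buffer.clear()
        return out


agg = PacketAggregator(output_size_threshold=8)
first = agg.receive(InputPacket(b"abc"))           # held in the buffer
second = agg.receive(InputPacket(b"de"))           # still below the threshold
third = agg.receive(InputPacket(b"fgh"))           # reaches the threshold, emitted
forced = agg.receive(InputPacket(b"x", send=True))  # send condition forces output
```

In this sketch small packets accumulate until the threshold is reached or a packet carries the send flag, which is one plausible reading of the abstract rather than the patented design.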
System direct memory access engine offload
Granted: December 28, 2021
Patent Number:
11210248
Systems, devices, and methods for direct memory access. A system direct memory access (SDMA) device disposed on a processor die sends a message which includes physical addresses of a source buffer and a destination buffer, and a size of a data transfer, to a data fabric device. The data fabric device sends an instruction which includes the physical addresses of the source and destination buffer, and the size of the data transfer, to first agent devices. Each of the first agent devices…
Cache access measurement deskew
Granted: December 28, 2021
Patent Number:
11210234
A processor includes a cache having two or more test regions and a larger non-test region. The processor further includes a cache controller that applies different cache replacement policies to the different test regions of the cache, and a performance monitor that measures performance metrics for the different test regions, such as a cache hit rate at each test region. Based on the performance metrics, the cache controller selects a cache replacement policy for the non-test region, such…
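The selection scheme in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two policies (LRU and FIFO), the fully associative test regions, the region size, and the names `TestRegion` and `select_policy` are hypothetical and do not come from the patent.

```python
from collections import OrderedDict


class TestRegion:
    """Small fully associative test region driven by one replacement policy."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy          # "lru" or "fifo" (assumed policies)
        self.lines = OrderedDict()    # insertion order doubles as eviction order
        self.hits = 0
        self.accesses = 0

    def access(self, addr):
        self.accesses += 1
        if addr in self.lines:
            self.hits += 1
            if self.policy == "lru":
                self.lines.move_to_end(addr)  # refresh recency on a hit
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the oldest entry
        self.lines[addr] = True
        return False

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


def select_policy(regions):
    """Pick the policy of the test region with the highest measured hit rate."""
    return max(regions, key=lambda r: r.hit_rate()).policy


trace = ["A", "B", "A", "C", "A"]
regions = [TestRegion(2, "lru"), TestRegion(2, "fifo")]
for addr in trace:
    for region in regions:
        region.access(addr)
chosen = select_policy(regions)  # policy to apply to the non-test region
```

On this short trace the LRU region records two hits and the FIFO region one, so `chosen` is `"lru"`. A real controller would feed each test region only the accesses that map to it, rather than replaying one trace through every region as this sketch does.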
High dynamic range for head-mounted display device
Granted: December 28, 2021
Patent Number:
11209899
A technique for adjusting the brightness values of images to be displayed on a stereoscopic head mounted display is provided herein. This technique improves the perceived dynamic range of the head mounted display by dynamically adjusting the pixel intensities (also known generally as “exposure”) of the images presented on the head mounted display based on a detected gaze direction. The head mounted display includes an eye tracker that is able to sense the gaze directions of the eyes.…
Memory with expandable row width
Granted: December 21, 2021
Patent Number:
11205477
A method for operating a memory device includes initiating an access operation to a corresponding row of an array of bit cells of the memory device. Responsive to an expansion mode signal having a first state, the method further includes dynamically operating each column of a plurality of columns of the array to access each bit cell of a corresponding row within the plurality of columns during the access operation. Alternatively, responsive to the expansion mode state signal having a…
System performance management using prioritized compute units
Granted: December 21, 2021
Patent Number:
11204871
Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially…
Thermal management using variation of thermal resistance of thermal interface
Granted: December 14, 2021
Patent Number:
11201104
A thermal management system includes an integrated circuit having an active side including a control circuit and a backside including a first set of electrodes distributed across the backside. The thermal management system includes a heat exchanger having a surface including a second set of electrodes. The thermal management system includes a thermal interface material including thermally conductive particles suspended in a fluid. The thermal interface material is disposed between the…
Texture processor based ray tracing acceleration method and system
Granted: December 14, 2021
Patent Number:
11200724
A texture processor based ray tracing accelerator method and system are described. The system includes a shader, texture processor (TP) and cache, which are interconnected. The TP includes a texture address unit (TA), a texture cache processor (TCP), a filter pipeline unit and a ray intersection engine. The shader sends a texture instruction which contains ray data and a pointer to a bounding volume hierarchy (BVH) node to the TA. The TCP uses an address provided by the TA to fetch BVH…
Data integrity for persistent memory systems and the like
Granted: December 14, 2021
Patent Number:
11200106
A data processing system includes a memory channel, a memory coupled to the memory channel, and a data processor. The data processor is coupled to the memory channel and accesses the memory over the memory channel using a packet structure defining a plurality of commands and having corresponding address bits, data bits, and user bits. The data processor communicates with the memory over the memory channel using a first type of error code. In response to a write access request, the data…
Broadcast synchronization for dynamically adaptable arrays
Granted: December 14, 2021
Patent Number:
11200060
An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first…
Apparatus and method for providing workload distribution of threads among multiple compute units
Granted: December 7, 2021
Patent Number:
11194634
In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal…
System and method for application migration for a dockable device
Granted: December 7, 2021
Patent Number:
11194740
Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes determining a docking state of a dockable device while at least an application is running. Application migration from the dockable device to a docking station is initiated when the dockable device is moving to a docked state.…