AMD Patent Grants

Early culling for ray tracing

Granted: February 1, 2022
Patent Number: 11238640
A technique for performing ray tracing operations is provided. The technique includes reading descendant-shared type metadata for a non-leaf node of a bounding volume hierarchy; identifying one or more culling types for a ray-intersection test for a ray; and determining whether to treat the non-leaf node as not intersected based on whether the one or more culling types includes at least one type specified by the descendant-shared type metadata.
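
The culling decision in this abstract can be sketched as a bitmask test. This is only an illustration, not the patented implementation: the type names in `GeomType`, the `Node` class, and the function name are hypothetical, and the abstract does not say how the metadata is encoded.

```python
from dataclasses import dataclass
from enum import IntFlag


class GeomType(IntFlag):
    # Hypothetical geometry types; the abstract does not name any.
    NONE = 0
    OPAQUE = 1
    NON_OPAQUE = 2
    TRIANGLE = 4
    PROCEDURAL = 8


@dataclass
class Node:
    # Assumed encoding: the types that every descendant of this non-leaf
    # node shares (NONE if the descendants have nothing in common).
    shared_types: GeomType = GeomType.NONE


def treat_as_not_intersected(node: Node, culling_types: GeomType) -> bool:
    """Skip the node if the ray culls at least one type all descendants share."""
    return bool(node.shared_types & culling_types)


all_opaque = Node(shared_types=GeomType.OPAQUE | GeomType.TRIANGLE)
print(treat_as_not_intersected(all_opaque, GeomType.OPAQUE))      # True
print(treat_as_not_intersected(all_opaque, GeomType.NON_OPAQUE))  # False
```

If every descendant is of a type the ray culls, the whole subtree can be skipped without descending into it.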

Method and apparatus for controlling cache line storage in cache memory

Granted: February 1, 2022
Patent Number: 11237972
A method and apparatus physically partition clean and dirty cache lines into separate memory partitions, such as one or more banks, so that during low-power operation a cache memory controller reduces power consumption of the cache memory containing the clean-only data. The cache memory controller controls a refresh operation so that a data refresh does not occur for the clean-data-only banks, or the refresh rate is reduced for those banks. Partitions that store dirty data…

Method for a reliability, availability, and serviceability-conscious huge page support

Granted: February 1, 2022
Patent Number: 11237928
A method includes reserving memory capacity in a first memory device as patch memory region for backing faulted memory, receiving a memory error indication indicating an uncorrectable error in a faulted segment in a second memory device and, in response to the memory error indication, associating in a remapping table the faulted segment with a patch segment in the patch memory region. The faulted segment is smaller than a memory page size of the second memory device. The method also…
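
A minimal sketch of the remapping table described above, assuming segments are identified by hypothetical `(device, index)` tuples. The class and method names are illustrative, and the `resolve` lookup on access is an assumption: the truncated abstract does not say how the table is consulted.

```python
class PatchRemapper:
    """Toy model: maps faulted segments to reserved patch segments."""

    def __init__(self, patch_segments):
        # Capacity reserved in the first memory device as the patch region.
        self.free_patches = list(patch_segments)
        # Remapping table: faulted segment -> patch segment.
        self.remap_table = {}

    def on_uncorrectable_error(self, faulted_segment):
        """Associate a faulted segment with a patch segment."""
        if faulted_segment not in self.remap_table:
            if not self.free_patches:
                raise MemoryError("patch memory region exhausted")
            self.remap_table[faulted_segment] = self.free_patches.pop(0)
        return self.remap_table[faulted_segment]

    def resolve(self, segment):
        # Assumption: accesses to a remapped segment are redirected to
        # its patch segment; all other segments are left untouched.
        return self.remap_table.get(segment, segment)


remapper = PatchRemapper([("dev0", 0), ("dev0", 1)])
remapper.on_uncorrectable_error(("dev1", 42))
print(remapper.resolve(("dev1", 42)))  # ('dev0', 0)
print(remapper.resolve(("dev1", 43)))  # ('dev1', 43)
```

Because the faulted segment is smaller than a page, only that segment is backed by patch memory and the rest of the huge page stays in place.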

Arithmetic logic unit register sequencing

Granted: February 1, 2022
Patent Number: 11237827
A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double…

Linear, low-latency power supply monitor

Granted: February 1, 2022
Patent Number: 11237220
In one form, a power supply monitor includes a current controlled oscillator circuit, a time-to-digital converter, and an output divider. The current controlled oscillator circuit has an input for receiving a power supply voltage to be measured, and an output for providing a frequency signal having a frequency linearly proportional to the power supply voltage. The time-to-digital converter has an input coupled to the output of the current controlled oscillator circuit, and an output for…

Heterogeneous parallel primitives programming model

Granted: January 25, 2022
Patent Number: 11231962
With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees…

In memory logic functions using memory arrays

Granted: January 25, 2022
Patent Number: 11233510
Systems, apparatuses, and methods for efficiently performing operations are disclosed. A computing system uses a memory for storing data, and one or more processing units. The memory includes multiple rows for storing the data with each intersection of a row and a column being a memory bit cell. The memory processes operations. For particular operations, the two or more operands are accessed simultaneously for generating a result without being read out and stored. Two indications…

Methods and devices for testing multiple memory configurations

Granted: January 25, 2022
Patent Number: 11232847
Methods, devices, and systems for testing a number of combinations of memory in a computer system. A modular memory device is installed in a memory channel in communication with a processor. The modular memory device includes a number of memory storage devices. The number of memory storage devices include a number of pins. A subset of the number of memory storage devices is selected. A subset of the plurality of pins which do not correspond to the subset of the number of memory storage…

Data flow in a distributed graphics processing unit architecture

Granted: January 25, 2022
Patent Number: 11232622
An apparatus includes a command buffer configured to temporarily store commands. The apparatus also includes processing units disposed at a substrate. The processing units are configured to access a plurality of copies of a command from the command buffer. The processing units include first processing units (such as fixed function hardware blocks) to perform geometry operations indicated by the command on a set of primitives. The geometry operations are performed concurrently by the…

Cache for storing regions of data

Granted: January 25, 2022
Patent Number: 11232039
Systems, apparatuses, and methods for efficiently performing memory accesses in a computing system are disclosed. A computing system includes one or more clients, a communication fabric and a last-level cache implemented with low latency, high bandwidth memory. The cache controller for the last-level cache determines a range of addresses corresponding to a first region of system memory with a copy of data stored in a second region of the last-level cache. The cache controller sends a…

Mechanism for mitigating information leak via cache side channels during speculative execution

Granted: January 25, 2022
Patent Number: 11231931
A processor includes a first core and a second core to execute computer instructions. Each of the cores includes its own private memory cache and speculative load queue. The speculative load queue stores cachelines for the computer instructions and data when the core is operating in a speculative state with respect to a process or thread. The processor includes a state tracking buffer having a state field to store a speculative exclusive ownership state for each cacheline in the…

Static random access memory read path with latch

Granted: January 18, 2022
Patent Number: 11227651
A read path for reading data from a memory includes a sense amplifier having data (SAT) and data complement (SAC) output nodes and a latch. The latch includes an input tri-state inverter including first and second PMOS transistors connected between VDD and an intermediate node, and first and second NMOS transistors connected between VSS and the intermediate node. A gate connection of the first PMOS and NMOS transistors is connected to the SAT node; a gate connection of the second PMOS…

Memory bandwidth reduction techniques for low power convolutional neural network inference applications

Granted: January 18, 2022
Patent Number: 11227214
Systems, apparatuses, and methods for implementing memory bandwidth reduction techniques for low power convolutional neural network inference applications are disclosed. A system includes at least a processing unit and an external memory coupled to the processing unit. The system detects a request to perform a convolution operation on input data from a plurality of channels. Responsive to detecting the request, the system partitions the input data from the plurality of channels into 3D…

Using a bloom filter to reduce the number of memory addresses tracked by a coherence directory

Granted: January 18, 2022
Patent Number: 11226900
An approach for tracking data stored in caches uses a Bloom filter to reduce the number of addresses that need to be tracked by a coherence directory. When a requested address is determined to not be currently tracked by either the coherence directory or the Bloom filter, tracking of the address is initiated in the Bloom filter, but not in the coherence directory. Initiating tracking of the address in the Bloom filter includes setting hash bits in the Bloom filter so that subsequent…
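
The first-touch behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the patented design: the class names, the SHA-256-based hashing, and the filter size are hypothetical, and promoting an address into the directory on a later request is an assumption, since the abstract is truncated before describing what subsequent requests do.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, address):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address):
        return all((self.bits >> pos) & 1 for pos in self._positions(address))


class CoherenceTracker:
    def __init__(self):
        self.directory = set()  # exact tracking, limited capacity in hardware
        self.bloom = BloomFilter()

    def on_request(self, address):
        if address in self.directory:
            return "directory"
        if self.bloom.might_contain(address):
            # Assumption: an address seen before is promoted to the directory.
            self.directory.add(address)
            return "promoted"
        # Not tracked anywhere: start tracking in the Bloom filter only.
        self.bloom.add(address)
        return "bloom"


tracker = CoherenceTracker()
print(tracker.on_request(0x1000))  # bloom
print(tracker.on_request(0x1000))  # promoted
```

Addresses that are requested only once never occupy a directory entry in this sketch, which is how the filter reduces the number of tracked addresses.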

Selective prefetching in multithreaded processing units

Granted: January 18, 2022
Patent Number: 11226819
A processing unit includes a plurality of processing elements and one or more caches. A first thread executes a program that includes one or more prefetch instructions to prefetch information into a first cache. Prefetching is selectively enabled when executing the first thread on a first processing element dependent upon whether one or more second threads previously executed the program on the first processing element. The first thread is then dispatched to execute the program on the…

Re-purposing byte enables as clock enables for power savings

Granted: January 11, 2022
Patent Number: 11223575
Systems, apparatuses, and methods for efficient data transfer in a computing system are disclosed. A source generates packets to send across a communication fabric (or fabric) to a destination. The source generates partition enable signals for the partitions of payload data. The source negates an enable signal for a particular partition when the source determines the packet type indicates the particular partition should have an associated asserted enable signal in the packet, but the…

Refresh management for DRAM

Granted: January 11, 2022
Patent Number: 11222685
A memory controller interfaces with a dynamic random access memory (DRAM) over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the DRAM. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit only issues a refresh management (RFM) command if there is no REF command currently held at the refresh…

Error handling for resilient software

Granted: January 11, 2022
Patent Number: 11221902
Error handling for resilient software includes: receiving data indicating a region of resilient memory; detecting an error associated with a region of memory; and, in response to the region of memory falling within the region of resilient memory, preventing an exception from being raised for the error by preventing the region of memory from being identified as including the error.
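
A minimal sketch of that check, assuming regions are byte ranges given as a start address and a length. The class and method names are hypothetical and the abstract does not specify any interface.

```python
class ResilientErrorHandler:
    def __init__(self):
        # Registered resilient regions as half-open (start, end) ranges.
        self.resilient_regions = []

    def register_resilient_region(self, start, length):
        """Record data indicating a region of resilient memory."""
        self.resilient_regions.append((start, start + length))

    def should_raise_exception(self, start, length):
        """Return False when the erroring region lies within a resilient region."""
        end = start + length
        for r_start, r_end in self.resilient_regions:
            if r_start <= start and end <= r_end:
                return False  # software tolerates errors here; suppress
        return True


handler = ResilientErrorHandler()
handler.register_resilient_region(0x1000, 0x1000)   # covers 0x1000..0x1FFF
print(handler.should_raise_exception(0x1800, 64))   # False
print(handler.should_raise_exception(0x3000, 64))   # True
```

An error region that only partly overlaps a resilient region still raises in this sketch, since the abstract speaks of the region "falling within" resilient memory.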

Self refresh state machine mop array

Granted: January 11, 2022
Patent Number: 11221772
A system includes a memory system comprising a memory module and a processor adapted to access the memory module using a memory controller that includes a controller having an input for receiving a power state change request signal and an output for providing memory operations, and a memory operation array comprising a plurality of entries. Each entry includes a plurality of encoded fields. The memory operation array is programmable to store different sequences of commands for particular…

Memory access commands with near-memory address generation

Granted: January 4, 2022
Patent Number: 11216373
A memory controller may be configured with command logic that is capable of sending a memory access command having incomplete address information via a command/address bus that connects the memory controller to memory modules. The memory controller may send the memory access command via the bus for accessing data stored at memory locations of the memory modules. The memory locations may correspond to different near-memory generated addresses, reflecting that the data is not address aligned across…