AMD Patent Applications

METHOD AND APPARATUS TO ACCELERATE RENDERING OF GRAPHICS IMAGES

Granted: July 20, 2017
Application Number: 20170206625
Described is a method and apparatus to accelerate rendering of 3D graphics images. When rendering, the transformation matrix (or equivalent) used for projecting primitives is modified so that a resulting image is smaller and/or warped compared to a regular unmodified rendering. The effect of such transformation is fewer pixels being rendered and thus better performance. To compute the final image, the warped image is rectified by an inverse transformation. Depending on the warping…
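
The idea in this abstract can be illustrated with a rough sketch. It assumes the simplest possible warp, a uniform scale applied on top of the projection matrix; the abstract does not say which warps are actually used, and every function name below is hypothetical.

```python
def scale_matrix(sx, sy):
    """4x4 matrix that shrinks clip-space x and y (assumed warp: plain scaling)."""
    return [
        [sx, 0.0, 0.0, 0.0],
        [0.0, sy, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def warped_projection(projection, sx, sy):
    """Modify the projection so primitives land in a smaller image region."""
    return matmul(scale_matrix(sx, sy), projection)


def rendered_pixels(width, height, sx, sy):
    """Pixel count covered by the shrunken image."""
    return round(width * sx) * round(height * sy)


def rectify(ndc_x, ndc_y, sx, sy):
    """Inverse transform: map a warped normalized device coordinate back."""
    return ndc_x / sx, ndc_y / sy
```

With `sx = sy = 0.5` on a 1920x1080 target, `rendered_pixels` reports a quarter of the original pixel count, which is where the claimed speedup would come from in this simplified model.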

MEMORY MANAGEMENT IN GRAPHICS AND COMPUTE APPLICATION PROGRAMMING INTERFACES

Granted: July 20, 2017
Application Number: 20170206630
Methods are provided for creating objects in a way that permits an API client to explicitly participate in memory management for an object created using the API. Methods for managing data object memory include requesting memory requirements for an object using an API and expressly allocating a memory location for the object based on the memory requirements. Methods are also provided for cloning objects such that a state of the object remains unchanged from the original object to the…
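
The request-then-allocate flow described here can be sketched as follows. This is a minimal illustration, not the API from the application: `ApiObject`, `Heap`, `get_memory_requirements`, and `bind_memory` are hypothetical names, and the bump allocator is an assumption chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class MemoryRequirements:
    size: int
    alignment: int


class Heap:
    """Client-owned memory pool with a simple bump allocator (assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.offset = 0

    def allocate(self, req):
        # Round the current offset up to the requested alignment.
        start = -(-self.offset // req.alignment) * req.alignment
        if start + req.size > self.capacity:
            raise MemoryError("heap exhausted")
        self.offset = start + req.size
        return start


class ApiObject:
    """Hypothetical API object that does not allocate its own memory."""

    def __init__(self, size, alignment):
        self._req = MemoryRequirements(size, alignment)
        self.bound_offset = None

    def get_memory_requirements(self):
        return self._req

    def bind_memory(self, heap, offset):
        if offset % self._req.alignment != 0:
            raise ValueError("offset violates alignment requirement")
        if offset + self._req.size > heap.capacity:
            raise ValueError("object does not fit in heap")
        self.bound_offset = offset
```

The client first calls `get_memory_requirements()`, then picks a location with `heap.allocate(...)`, and finally passes that offset to `bind_memory(...)`, so placement decisions stay with the client rather than the API.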

METHOD AND APPARATUS FOR PERFORMING HIGH THROUGHPUT TESSELLATION

Granted: July 6, 2017
Application Number: 20170193697
A method, a system, and a computer-readable storage medium directed to performing high-speed parallel tessellation of 3D surface patches are disclosed. The method includes generating a plurality of primitives in parallel. Each primitive in the plurality is generated by a sequence of functional blocks, in which each sequence acts independently of all the other sequences.

TEXTURE COMPRESSION TECHNIQUES

Granted: July 6, 2017
Application Number: 20170195683
A texture compression method is described. The method comprises splitting an original texture having a plurality of pixels into original blocks of pixels. Then, for each of the original blocks of pixels, a partition is identified that has one or more disjoint subsets of pixels whose union is the original block of pixels. The original block of pixels is further subdivided into one or more subsets according to the identified partition. Finally, each subset is independently compressed to…
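
The block, partition, and subset steps can be sketched roughly as below. The abstract is truncated before it says how each subset is compressed, so the endpoint-plus-index codec here is an assumption, as are the mean-threshold partition choice, the grayscale pixels, and all function names.

```python
def split_into_blocks(texture, bs=4):
    """Yield bs x bs blocks (flat lists) from a 2D list of grayscale pixels.

    Assumes the texture dimensions are multiples of bs.
    """
    h, w = len(texture), len(texture[0])
    for by in range(0, h, bs):
        for bx in range(0, w, bs):
            yield [texture[y][x]
                   for y in range(by, by + bs)
                   for x in range(bx, bx + bs)]


def choose_partition(block):
    """Hypothetical partition choice: threshold at the block mean."""
    mean = sum(block) / len(block)
    return [0 if p <= mean else 1 for p in block]


def compress_subset(pixels, levels=4):
    """Assumed codec: two endpoints plus a quantized index per pixel."""
    lo, hi = min(pixels), max(pixels)
    span = hi - lo
    idx = [0 if span == 0 else round((p - lo) * (levels - 1) / span)
           for p in pixels]
    return lo, hi, idx


def compress_block(block):
    """Split a block by its partition and compress each subset on its own."""
    partition = choose_partition(block)
    subsets = {}
    for pixel, subset_id in zip(block, partition):
        subsets.setdefault(subset_id, []).append(pixel)
    return partition, {s: compress_subset(px) for s, px in subsets.items()}
```

Because the subsets are disjoint and cover the whole block, a block with two distinct regions gets a separate endpoint pair for each region instead of one pair stretched across both.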

HARDWARE ACCURACY COUNTERS FOR APPLICATION PRECISION AND QUALITY FEEDBACK

Granted: June 29, 2017
Application Number: 20170185409
Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter…
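
A software model may make the bit-analysis step more concrete. This is only a sketch: it assumes signed integer values, a fixed set of candidate widths, and a policy of bumping a counter whenever a result would have fit in a narrower type than the one declared. The abstract is truncated before it states the real adjustment rule, and the names are hypothetical.

```python
from collections import Counter

WIDTHS = (8, 16, 32, 64)  # candidate signed integer widths (assumed)


def min_int_width(value):
    """Smallest signed width in WIDTHS that can represent value."""
    for width in WIDTHS:
        if -(1 << (width - 1)) <= value < (1 << (width - 1)):
            return width
    raise OverflowError("value does not fit in 64 bits")


class AccuracyCounters:
    """Software stand-in for a bank of hardware counters."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, declared_width, result):
        """Record a result; count it if a narrower type would have sufficed."""
        needed = min_int_width(result)
        if needed < declared_width:
            self.counts[needed] += 1
        return needed
```

An application could read such counters after a run and, if most 32-bit results fit in 8 or 16 bits, consider switching to a narrower datatype.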

DATA DRIVEN SCHEDULER ON MULTIPLE COMPUTING CORES

Granted: June 29, 2017
Application Number: 20170185451
Methods, devices, and systems for data driven scheduling of a plurality of computing cores of a processor. A plurality of threads may be executed on the plurality of computing cores, according to a default schedule. The plurality of threads may be analyzed, based on the execution, to determine correlations among the plurality of threads. A data driven schedule may be generated based on the correlations. The plurality of threads may be executed on the plurality of computing cores…

REGION PROBE FILTER FOR DISTRIBUTED MEMORY SYSTEM

Granted: June 22, 2017
Application Number: 20170177484
A probe filter determines whether to issue a probe to at least one other processing node in response to a memory access request, and includes a region probe filter directory, a line probe filter directory, and a controller. The region probe filter directory identifies regions of memory for which at least one cache line may be cached in a data processing system and a state of each region, wherein a size of each region corresponds to a plurality of cache lines. The line probe filter…

METHOD AND SYSTEM FOR USING SOLID STATE DEVICE AS EVICTION PAD FOR GRAPHICS PROCESSING UNIT

Granted: June 22, 2017
Application Number: 20170178275
Described is a method and system for using a solid state device (SSD) as an eviction pad for graphics processing units (GPUs). The method for eviction processing includes a processor that determines when a dedicated memory associated with a GPU and a host memory associated with the processor are congested. The processor sends a content transfer command to the SSD. The SSD initiates a content transfer directly with the dedicated memory associated with the GPU. The GPU transfers the…

METHOD AND APPARATUS FOR PERFORMING INTER-LANE POWER MANAGEMENT

Granted: June 15, 2017
Application Number: 20170168546
A method and apparatus for performing inter-lane power management includes de-energizing one or more execution lanes upon a determination that the one or more execution lanes are to be predicated. Energy from the predicated execution lanes is redistributed to one or more active execution lanes.
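
As a toy model of the redistribution step, the sketch below assumes every lane starts with the same power budget and that the budget freed by predicated-off lanes is split evenly among the active ones. The even split and the function name are assumptions; the abstract only says the energy is redistributed.

```python
def redistribute_power(lane_budget, active_mask):
    """Return per-lane power after de-energizing predicated-off lanes.

    active_mask[i] is True when lane i executes and False when it is
    predicated off.
    """
    active_count = sum(1 for on in active_mask if on)
    if active_count == 0:
        return [0.0] * len(active_mask)
    total = lane_budget * len(active_mask)
    share = total / active_count  # assumed policy: even split
    return [share if on else 0.0 for on in active_mask]
```

With four lanes at 1.0 unit each and two lanes predicated off, each remaining lane receives 2.0 units in this model.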

METHOD AND APPARATUS FOR TIME-BASED SCHEDULING OF TASKS

Granted: June 8, 2017
Application Number: 20170161114
A computing device is disclosed. The computing device includes an Accelerated Processing Unit (APU) including at least a first Heterogeneous System Architecture (HSA) computing device and at least a second HSA computing device, the second computing device being a different type than the first computing device, and an HSA Memory Management Unit (HMMU) allowing the APU to communicate with at least one memory. A computing task is enqueued on an HSA-managed queue that is set to run on the…

SYSTEM AND METHOD FOR APPLICATION MIGRATION

Granted: June 8, 2017
Application Number: 20170161212
Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes determining a docking state of a dockable device while at least an application is running. Application migration from the dockable device to a docking station is initiated when the dockable device is moving to a docked state.…

REDUCING POWER NEEDED TO SEND SIGNALS OVER WIRES

Granted: June 8, 2017
Application Number: 20170163282
Methods and apparatus are described. A method, implemented in a decoder, includes receiving two or more signals from an encoder over two or more respective wires. At least one of the two or more signals includes at least one code that was recoded by the encoder. The decoder receives a recoding table. The recoding table provides a mapping indicating the recoding for each code that was recoded by the encoder in the received two or more signals. The decoder decodes the two or more received…
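
The decoder side can be sketched as a table lookup. This assumes the recoding table maps each original code to the code actually sent on the wire, and that codes absent from the table pass through unchanged; both the table layout and the function names are hypothetical.

```python
def invert_table(recoding_table):
    """Turn an original -> recoded mapping into recoded -> original."""
    inverse = {sent: original for original, sent in recoding_table.items()}
    if len(inverse) != len(recoding_table):
        raise ValueError("recoding table is not invertible")
    return inverse


def decode(wire_signals, recoding_table):
    """Decode one list of codes per wire using the received table."""
    inverse = invert_table(recoding_table)
    return [[inverse.get(code, code) for code in signal]
            for signal in wire_signals]
```

For the pass-through rule to be unambiguous, the table should be a permutation over the codes it touches, for example a swap of a frequent high-toggle code with a rare low-toggle one.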

METHOD AND APPARATUS FOR PERFORMING A PARALLEL SEARCH OPERATION

Granted: May 25, 2017
Application Number: 20170147608
A method and apparatus for performing a search in a processor-in-memory (PIM) system having a first processor and at least one memory module includes receiving one or more images by the first processor. The first processor sends a query for a search of memory for a matching image to the one or more images to at least one memory module, which searches memory in the memory module, in response to the received query. The at least one memory module sends the results of the search to the first…

EFFICIENT PROCESSOR LOAD BALANCING USING PREDICATION

Granted: May 18, 2017
Application Number: 20170139748
A system and methods embodying some aspects of the present embodiments for efficient load balancing using predication flags are provided. The load balancing system includes a first processing unit, a second processing unit, and a shared queue. The first processing unit is in communication with a first queue. The second processing unit is in communication with a second queue. The first and second queues are each configured to hold a packet. The shared queue is configured to maintain a…

METHOD AND SYSTEMS OF CONTROLLING MEMORY-TO-MEMORY COPY OPERATIONS

Granted: May 4, 2017
Application Number: 20170123670
A memory-to-memory copy operation control system includes a processor configured to receive an instruction to perform a memory-to-memory copy operation and a memory module network in communication with the processor. The memory module network has a plurality of memory modules that include a proximal memory module in direct communication with the processor and one or more additional memory modules in communication with the processor via the proximal memory module. The system also includes…

MINIMIZING LATENCY FROM PERIPHERAL DEVICES TO COMPUTE ENGINES

Granted: April 13, 2017
Application Number: 20170102886
Methods, systems, and computer program products are provided for minimizing latency in an implementation where a peripheral device is used as a capture device and a compute device such as a GPU processes the captured data in a computing environment. In embodiments, a peripheral device and GPU are tightly integrated and communicate at a hardware/firmware level. Peripheral device firmware can determine and store compute instructions specifically for the GPU, in a command queue. The compute…

METHOD AND APPARATUS FOR WORKLOAD PLACEMENT ON HETEROGENEOUS SYSTEMS

Granted: April 13, 2017
Application Number: 20170102971
The methods and apparatus can assign processing core workloads to processing cores from a heterogeneous instruction set architecture (ISA) pool of available processing cores based on processing core metric results. For example, the method and apparatus can obtain processing core metric results for one or more processing cores, such as processing cores within general purpose processors, from a heterogeneous ISA pool of available processing cores. The method and apparatus can also obtain…

MULTI-PROTOCOL HEADER GENERATION SYSTEM

Granted: March 23, 2017
Application Number: 20170085472
A communication device includes a data source that generates data for transmission over a bus, and a data encoder that receives and encodes outgoing data. An encoder system receives outgoing data from a data source and stores the outgoing data in a first queue. An encoder encodes outgoing data with a header type that is based upon a header type indication from a controller and stores the encoded data that may be a packet or a data word with at least one layered header in a second queue…

PREEMPTIVE CONTEXT SWITCHING OF PROCESSES ON AN ACCELERATED PROCESSING DEVICE (APD) BASED ON TIME QUANTA

Granted: March 16, 2017
Application Number: 20170076421
Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.
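
The time-quantum rule can be simulated in a few lines. This sketch assumes that a preempted process goes to the back of a round-robin queue, which the abstract does not specify, and `run_with_quanta` is a hypothetical name.

```python
from collections import deque


def run_with_quanta(processes, quantum):
    """Simulate quantum-based preemption.

    processes maps a process name to its remaining run time.
    Returns a trace of (name, time_run, event) tuples.
    """
    queue = deque(processes.items())
    trace = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(remaining, quantum)
        if remaining > quantum:
            # Quantum expired before completion: the OS stops the process.
            trace.append((name, ran, "preempted"))
            queue.append((name, remaining - ran))  # assumed: requeue at back
        else:
            trace.append((name, ran, "completed"))
    return trace
```

A process that needs exactly one quantum finishes without being stopped, matching the abstract's condition that the interval expires before the process completes.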

GRAPHICS LIBRARY EXTENSIONS

Granted: March 2, 2017
Application Number: 20170061670
Methods for enabling graphics features in processors are described herein. Methods are provided to enable trinary built-in functions in the shader, allow separation of the graphics processor's address space from the requirement that all textures must be physically backed, enable use of a sparse buffer allocated in virtual memory, allow a reference value used for stencil test to be generated and exported from a fragment shader, provide support for use specific operations in the stencil…