Nvidia Patent Grants

Assigning priorities to computational work streams by mapping desired execution priorities to device priorities

Granted: April 25, 2017
Patent Number: 9632834
One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem. First, the software application assigns a desired priority to a stream using a call included in the API. The API receives this call and passes it to a driver. The driver maps the desired priority to an appropriate device priority associated with the parallel processing subsystem. Subsequently, if the software…
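
The mapping step described above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name `map_to_device_priority`, the convention that 0 is the highest priority, and the proportional scaling rule are all assumptions made for the example.

```python
def map_to_device_priority(desired: int, device_levels: int, app_levels: int) -> int:
    """Map a desired priority in [0, app_levels) onto [0, device_levels).

    Hypothetical convention: 0 is the highest priority on both sides.
    """
    if app_levels <= 0 or device_levels <= 0:
        raise ValueError("level counts must be positive")
    # Clamp out-of-range requests instead of rejecting them (an assumption).
    desired = max(0, min(desired, app_levels - 1))
    # Scale proportionally so the relative ordering of priorities is preserved.
    return desired * device_levels // app_levels
```

With eight application levels and two device levels, requests 0 through 3 land on device priority 0 and requests 4 through 7 land on device priority 1.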

Lazy runahead operation for a microprocessor

Granted: April 25, 2017
Patent Number: 9632976
Embodiments related to managing lazy runahead operations at a microprocessor are disclosed. For example, an embodiment of a method for operating a microprocessor described herein includes identifying a primary condition that triggers an unresolved state of the microprocessor. The example method also includes identifying a forcing condition that compels resolution of the unresolved state. The example method also includes, in response to identification of the forcing condition, causing the…

Method and system for reducing a polygon bounding box

Granted: April 25, 2017
Patent Number: 9633458
In a graphics processing pipeline, a processing unit establishes a bounding box around a polygon in order to identify sample points that are covered by the polygon. For a given sample point included within the bounding box, the processing unit constructs a set of lines that intersect at the sample point, where each line in the set of lines is parallel to at least one side of the polygon. When all vertices of the polygon reside on one side of at least one line in the set of lines, the…

Conservative rasterization of primitives using an error term

Granted: April 25, 2017
Patent Number: 9633469
A system, method, and computer program product are provided for conservative rasterization of primitives using an error term. In use, an edge equation is determined for each edge of a primitive, the edge equation having coefficients defining the edge of the primitive. Each edge of the primitive is shifted to enlarge the primitive by modifying coefficients of the edge equation defining the edge by an error term that is a predetermined amount. Pixels that intersect the primitive are then…
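
As a rough illustration of shifting edges by an error term, the sketch below builds edge equations of the form A*x + B*y + C for a counter-clockwise triangle and enlarges the triangle by adding a constant to each C coefficient. The helper names, the counter-clockwise winding, and the use of one constant for every edge are assumptions for the example; the excerpt does not say how the error term is chosen.

```python
def edge_coefficients(p0, p1):
    """Return (A, B, C) so that A*x + B*y + C >= 0 on the inner side of a
    counter-clockwise edge p0 -> p1."""
    x0, y0 = p0
    x1, y1 = p1
    return (y0 - y1, x1 - x0, x0 * y1 - x1 * y0)


def triangle_edges(v0, v1, v2):
    """Edge equations for a triangle given in counter-clockwise order."""
    return [edge_coefficients(v0, v1),
            edge_coefficients(v1, v2),
            edge_coefficients(v2, v0)]


def enlarge(edges, error_term):
    """Shift every edge outward by adding error_term to its C coefficient.

    With unnormalized coefficients the geometric shift of each edge is
    error_term / sqrt(A*A + B*B), so edges of different lengths move by
    different distances.
    """
    return [(a, b, c + error_term) for a, b, c in edges]


def covers(edges, x, y):
    """True when the sample point passes every edge test."""
    return all(a * x + b * y + c >= 0 for a, b, c in edges)
```

For the triangle (0, 0), (4, 0), (0, 4), the point (-0.25, 1.0) fails the unmodified test but passes once the edges are enlarged with an error term of 2.0.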

Shaped register file reads

Granted: April 18, 2017
Patent Number: 9626191
One embodiment of the present invention sets forth a technique for performing a shaped access of a register file that includes a set of N registers, wherein N is greater than or equal to two. The technique involves, for at least one thread included in a group of threads, receiving a request to access a first amount of data from each register in the set of N registers, and configuring a crossbar to allow the at least one thread to access the first amount of data from each register in the…

Graphics processing unit sharing between many applications

Granted: April 18, 2017
Patent Number: 9626216
A technique for executing a plurality of applications on a GPU. The technique involves establishing a first connection to a first application and a second connection to a second application, establishing a universal processing context that is shared by the first application and the second application, transmitting a first workload pointer to a first queue allocated to the first application, the first workload pointer pointing to a first workload generated by the first application,…

Technique for scaling the bandwidth of a processing element to match the bandwidth of an interconnect

Granted: April 18, 2017
Patent Number: 9626320
A transmitter is configured to scale up a low bandwidth delivered by a first processing element to match a higher bandwidth associated with an interconnect. A receiver is configured to scale down the high bandwidth delivered by the interconnect to match the lower bandwidth associated with a second processing element. The first processing element and the second processing element may thus communicate with one another across the interconnect via the transmitter and the receiver,…

8-transistor dual-ported static random access memory

Granted: April 18, 2017
Patent Number: 9627021
An 8-transistor SRAM (static random access memory) storage cell provides differential read bit lines that are precharged to a low voltage level for read operations. The 8-transistor storage cell provides separate ports for read and write operations, including differential read bit lines. Prior to each read operation, the differential read bit lines are precharged to the low voltage level. During read operations, one of the two differential read bit lines is pulled high towards a high…

Navigation device

Granted: April 18, 2017
Patent Number: 9628705
In one embodiment, a car navigation device is provided. The device comprises: at least one wide-angle camera; a video correction unit for acquiring video data from the wide-angle lens and correcting the video data; a video merging unit for acquiring corrected video data from the video correction unit and merging the corrected video data; an image recognition unit for acquiring video from the video merging unit and performing image recognition on the video; and a driving assistant unit for…

Reducing system power consumption due to USB host controllers

Granted: April 11, 2017
Patent Number: 9619004
Circuits, methods, and apparatus that reduce the power consumed by data transfers initiated by a USB host controller. Peripheral devices on a USB network are accessed with a reduced frequency in order to save power dissipated by a CPU and other circuits when reading data needed by the host controller. Instead of possibly accessing devices each frame, peripheral devices are accessed during some frames, and not accessed during others. A USB host controller may have two or more modes, such…

Method and system for bin coalescing for parallel divide-and-conquer sorting algorithms

Granted: April 11, 2017
Patent Number: 9619204
A system and method for performing sorting. The method includes partitioning a plurality of keys needing sorting into a first plurality of bins, wherein the bins are sequentially sorted. The plurality of keys is capable of being sorted into a sequence of keys using a corresponding ordering system. The method includes coalescing a first pair of consecutive bins, such that when coalesced the first pair of bins falls below a threshold. The method also includes ordering keys in the first…
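
The partition, coalesce, and order steps can be sketched along these lines. This is only an illustration under stated assumptions: integer keys, range-based partitioning, and the names `coalesce_bins` and `sort_with_coalescing` are hypothetical, and the rule "merge neighbours while the combined size stays below the threshold" is one reading of the abstract.

```python
def coalesce_bins(bins, threshold):
    """Merge consecutive bins while the merged size stays below threshold."""
    merged = []
    for current in bins:
        if merged and len(merged[-1]) + len(current) < threshold:
            merged[-1] = merged[-1] + list(current)
        else:
            merged.append(list(current))
    return merged


def sort_with_coalescing(keys, num_bins, threshold):
    """Partition integer keys by range, coalesce small bins, sort each bin."""
    if not keys:
        return []
    lo, hi = min(keys), max(keys)
    width = (hi - lo) // num_bins + 1
    # Bins are ordered relative to each other: every key in bin i is
    # smaller than every key in bin i + 1.
    bins = [[] for _ in range(num_bins)]
    for key in keys:
        bins[(key - lo) // width].append(key)
    result = []
    for merged_bin in coalesce_bins(bins, threshold):
        result.extend(sorted(merged_bin))
    return result
```

Because the bins are already ordered relative to one another, merging neighbours never breaks the overall order; it only reduces the number of small sorting tasks.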

Grouping and analysis of data access hazard reports

Granted: April 11, 2017
Patent Number: 9619364
A method for analyzing race conditions between multiple threads of an application is disclosed. The method comprises accessing hazard records for an application under test. It further comprises creating a graph comprising a plurality of vertices and a plurality of edges using the hazard records, wherein each vertex of the graph comprises information about a code location of a hazard and wherein each edge of the graph comprises hazard information between one or more vertices.…
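
A graph of this shape might be built as in the sketch below. The record layout (two code locations plus a hazard kind), the location strings, and the function name `build_hazard_graph` are assumptions for illustration; the excerpt does not describe the actual record format.

```python
from collections import defaultdict


def build_hazard_graph(records):
    """Build (vertices, edges) from hazard records.

    Each record is a hypothetical tuple (location_a, location_b, kind).
    vertices maps a code location to the number of hazards touching it;
    edges maps an unordered location pair to the hazard kinds seen there.
    """
    vertices = defaultdict(int)
    edges = defaultdict(list)
    for location_a, location_b, kind in records:
        vertices[location_a] += 1
        vertices[location_b] += 1
        # Sort the pair so (a, b) and (b, a) share one edge.
        edges[tuple(sorted((location_a, location_b)))].append(kind)
    return dict(vertices), dict(edges)
```

Grouping many raw reports under one edge per location pair is what lets a developer see a single entry for two lines of code that race repeatedly.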

Light transport simulator and method of construction and classification of light transport paths

Granted: April 11, 2017
Patent Number: 9619930
A light transport simulator and a method of constructing and classifying light transport paths. One embodiment of the light transport simulator includes a light transport simulator operable to construct and classify a light transport path between two points in a scene, including: (1) a memory configured to store dual deterministic finite automata (DFA) based on an LPE that defines criteria for accepting the light transport path, and (2) a processor configured to employ the dual DFA to…

Method and system of curve fitting for common focus measures

Granted: April 11, 2017
Patent Number: 9621780
An efficient method and system for estimating an optimal focus position for capturing an image are presented. Embodiments of the present invention initially determine an initial lens position dataset. Then, scores are calculated for each value of the initial lens position dataset producing a plurality of scores. Embodiments of the present invention then determine an optimum focus position through interpolation and extrapolation by relating the initial lens position dataset to the score…
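
One simple way to relate lens positions to scores is to fit a parabola through the best-scoring sample and its two neighbours and take the vertex. The sketch below does that; the parabolic model and the name `estimate_focus_position` are assumptions, since the excerpt does not say which curve the patent fits.

```python
def estimate_focus_position(positions, scores):
    """Estimate the peak of a focus measure from sampled lens positions.

    Fits a parabola through the highest-scoring sample and its neighbours.
    When the best sample sits at either end of the range, the vertex may
    fall outside the sampled positions (extrapolation).
    """
    if len(positions) != len(scores) or len(positions) < 3:
        raise ValueError("need at least three (position, score) samples")
    samples = sorted(zip(positions, scores))
    best = max(range(len(samples)), key=lambda i: samples[i][1])
    # Keep one neighbour on each side of the chosen centre sample.
    best = min(max(best, 1), len(samples) - 2)
    (x0, y0), (x1, y1), (x2, y2) = samples[best - 1:best + 2]
    denom = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    if denom == 0:
        # The three samples are collinear, so no vertex exists.
        return float(x1)
    numer = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    return x1 - 0.5 * numer / denom
```

The appeal of this kind of fit is that a handful of scored lens positions can stand in for an exhaustive sweep of the focus range.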

Confluence analysis and loop fast-forwarding for improving SIMD execution efficiency

Granted: April 4, 2017
Patent Number: 9612811
One embodiment of the present invention sets forth a method for causing thread convergence. The method includes determining that a control flow graph representing a first section of a program includes at least two non-overlapping paths that extend from a first divergent node to a candidate node. The method also includes determining that the first divergent node is not a dominator of the candidate node or that the candidate node is not a post-dominator of the first divergent node. The…

System, method, and computer program product for implementing software-based scoreboarding

Granted: April 4, 2017
Patent Number: 9612836
A system, method, and computer program product are provided for implementing a software-based scoreboarding mechanism. The method includes the steps of receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register and, based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units.
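
The barrier check can be modelled with a toy scoreboard like the one below. The class, its method names, the idea that each register counts operations still in flight, and the less-than-or-equal comparison are all assumptions for illustration; the abstract only says that an immediate value is compared with a register value.

```python
class Scoreboard:
    """Toy model in which each register counts operations still in flight."""

    def __init__(self, num_registers):
        self.registers = [0] * num_registers

    def issue(self, register_id):
        """Record that an operation tracked by this register has started."""
        self.registers[register_id] += 1

    def complete(self, register_id):
        """Record that one tracked operation has finished."""
        if self.registers[register_id] == 0:
            raise ValueError("no operation in flight for this register")
        self.registers[register_id] -= 1

    def barrier_passes(self, register_id, immediate):
        """Assumed rule: the next instruction may be dispatched once at most
        `immediate` tracked operations remain outstanding."""
        return self.registers[register_id] <= immediate
```

In this model a barrier with immediate value 0 waits for every tracked operation, while a larger immediate lets the following instruction go ahead with some operations still pending.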

Higher accuracy Z-culling in a tile-based architecture

Granted: April 4, 2017
Patent Number: 9612839
A graphics processing pipeline configured for z-cull operations. The graphics processing pipeline comprises a screen-space pipeline and a tiling unit. The screen-space pipeline includes a z-cull unit configured to perform z-culling operations. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to transmit the first set of primitives to the screen-space pipeline for processing. The tiling unit is…

Snoop and replay for completing bus transaction

Granted: April 4, 2017
Patent Number: 9612994
Systems and devices configured to implement techniques for ensuring the completion of transactions while minimizing latency and power consumption are described. A device may be operably coupled to a bidirectional communications bus. A bidirectional communications bus may include a clock line and a data line. The device may be configured to determine if an initiated transaction corresponds to a device in a low power state. The device may pause the transaction. The device may replay…

Method and system for implementing a secure chain of trust

Granted: April 4, 2017
Patent Number: 9613215
A method, an integrated circuit and a system for implementing a secure chain of trust are disclosed. While executing secure boot code in a secure boot mode, less-secure boot code may be authenticated using a secret key. A secure key may also be calculated or generated during the secure boot mode. After control is turned over to the authenticated less-secure boot code, at least one application may be authenticated using the secure key. Once authenticated in the less-secure boot mode, the…
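
The two-stage authentication can be sketched with Python's standard `hmac` module. This is a minimal model under stated assumptions: HMAC-SHA256 tags stand in for whatever authentication scheme the patent uses, the secure key is derived from the secret key with a fixed label, and all function names are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, image: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over a code image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def authenticate(key: bytes, image: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches the image."""
    return hmac.compare_digest(make_tag(key, image), tag)


def derive_secure_key(secret_key: bytes) -> bytes:
    """Assumed derivation: the secure key is an HMAC of a fixed label."""
    return hmac.new(secret_key, b"secure-key", hashlib.sha256).digest()


def boot_chain(secret_key, boot_code, boot_tag, app, app_tag):
    # Secure boot mode: the secret key is only used in this stage.
    if not authenticate(secret_key, boot_code, boot_tag):
        return "halt: boot code rejected"
    secure_key = derive_secure_key(secret_key)
    # Less-secure boot mode: only the derived secure key is used from here on.
    if not authenticate(secure_key, app, app_tag):
        return "halt: application rejected"
    return "application authenticated"
```

Keeping the secret key confined to the first stage means that a compromise of the less-secure boot code exposes only the derived key.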

Host context techniques for server based graphics processing

Granted: April 4, 2017
Patent Number: 9613390
The server based graphics processing techniques, described herein, include receiving function calls by a three dimension graphics application programming interface host-guest communication manager (D3D HGCM) service module from one or more given instances of a guest shim layer through a communication channel of a host-guest communication manager (HGCM). The one or more given instances of the guest shim layer are executing under control of a respective given instance of a guest operating…