Nvidia Patent Applications

SCALABLE LIGHT-WEIGHT PROTOCOLS FOR WIRE-SPEED PACKET ORDERING

Granted: March 24, 2022
Application Number: 20220095017
A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.

Efficient Neural Network Accelerator Dataflows

Granted: March 10, 2022
Application Number: 20220076110
A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.

EFFICIENT SOFTMAX COMPUTATION

Granted: March 3, 2022
Application Number: 20220067513
Solutions improving efficiency of Softmax computation applied for efficient deep learning inference in transformers and other neural networks. The solutions utilize a reduced-precision implementation of various operations in Softmax, replacing e^x with 2^x to reduce instruction overhead associated with computing e^x, and replacing floating point max computation with integer max computation. Further described is a scalable implementation that decomposes Softmax into UnNormalized Softmax and…
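
The base substitution mentioned in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: it only shows that a base-2 Softmax reproduces the base-e result when the inputs are pre-scaled by log2(e). The function names are hypothetical, and the reduced-precision arithmetic, the integer max, and the decomposition described in the abstract are not modeled.

```python
import math

# Constant used to rescale inputs so that 2**(x * LOG2_E) equals e**x.
LOG2_E = math.log2(math.e)


def softmax_base_e(xs):
    """Reference Softmax using e**x, with the usual max subtraction for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [v / total for v in exps]


def softmax_base_2(xs):
    """Softmax computed with 2**x after scaling the inputs by log2(e).

    Hypothetical helper for illustration only; everything is kept in
    ordinary floating point.
    """
    scaled = [x * LOG2_E for x in xs]
    m = max(scaled)
    pows = [2.0 ** (s - m) for s in scaled]
    total = sum(pows)
    return [v / total for v in pows]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0]
    print(softmax_base_e(logits))
    print(softmax_base_2(logits))
```

Up to floating-point rounding, both functions return the same probabilities, because any constant scale folded into the exponent base cancels once the inputs are rescaled.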

STANDARD CELL LAYOUT GENERATION WITH APPLIED ARTIFICIAL INTELLIGENCE

Granted: January 27, 2022
Application Number: 20220027546
A genetic algorithm is utilized to generate routing candidates to which a reinforcement learning model is applied to correct the design rule constraint violations incrementally. A design rule checker provides feedback on the violations to the reinforcement learning model and the model learns how to fix the violations. A layout device placer based upon a simulated annealing method may also be utilized.

TECHNIQUES FOR DIVERGENT THREAD GROUP EXECUTION SCHEDULING

Granted: January 27, 2022
Application Number: 20220027194
Warp sharding techniques to switch execution between divergent shards on instructions that trigger a long stall, thereby interleaving execution between diverged threads within a warp instead of across warps. The technique may be applied to mitigate pipeline stalls in applications with low warp occupancy and high divergence. Warp data cache locality may also be improved by concentrating memory accesses within a warp rather than spreading them across warps.

FOVEATION AND SPATIAL HASHING IN LAYER-BASED COMPUTER-GENERATED HOLOGRAMS

Granted: January 27, 2022
Application Number: 20220026715
The computational scaling challenges of holographic displays are mitigated by techniques for generating holograms that introduce foveation into a wave front recording planes approach to hologram generation. Spatial hashing is applied to organize the points or polygons of a display object into keys and values.
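
As a rough illustration of organizing points into keys and values, the sketch below buckets 3D points by the grid cell that contains them. This is generic spatial hashing, not the method claimed in the application: the function name `build_spatial_hash`, the uniform cell size, and the choice of integer cell coordinates as the key are all assumptions made for the example.

```python
import math
from collections import defaultdict


def build_spatial_hash(points, cell_size):
    """Group points by the integer grid cell that contains them.

    Keys are tuples of cell coordinates; values are lists of the points
    falling in that cell. Hypothetical helper for illustration only.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    grid = defaultdict(list)
    for point in points:
        # floor() keeps negative coordinates in the correct cell.
        key = tuple(math.floor(c / cell_size) for c in point)
        grid[key].append(point)
    return grid


if __name__ == "__main__":
    pts = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.9), (1.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
    for key, members in build_spatial_hash(pts, 1.0).items():
        print(key, members)
```

Looking up the neighbors of a point then only requires visiting the cell of that point and the adjacent cells, rather than scanning every point of the display object.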

ADVERSARIAL SCENARIOS FOR SAFETY TESTING OF AUTONOMOUS VEHICLES

Granted: December 16, 2021
Application Number: 20210389769
Techniques to generate driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. Nodes of the path are assigned a time for action to avoid collision from the node. The generated scenarios may be simulated in a computer.

TENSOR-BASED DRIVING SCENARIO CHARACTERIZATION

Granted: December 16, 2021
Application Number: 20210387643
Techniques to characterize driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. The scenarios may be characterized using a tree-based or tensor-based approach.

USE OF STASHING BUFFERS TO IMPROVE THE EFFICIENCY OF CROSSBAR SWITCHES

Granted: November 4, 2021
Application Number: 20210344616
A switch architecture enables ports to stash packets in unused buffers on other ports, exploiting excess internal bandwidth that may exist, for example, in a tiled switch. This architecture leverages unused port buffer memory to improve features such as congestion handling and error recovery.

Adaptive Pixel Sampling Order for Temporally Dense Rendering

Granted: November 4, 2021
Application Number: 20210344944
A method dynamically selects one of a first sampling order and a second sampling order for a ray trace of pixels in a tile where the selection is based on a motion vector for the tile. The sampling order may be a bowtie pattern or an hourglass pattern.

FOVEATED DISPLAY FOR AUGMENTED REALITY

Granted: November 4, 2021
Application Number: 20210341741
An augmented reality display system includes a first beam path for a foveal inset image on a holographic optical element, a second beam path for a peripheral display image on the holographic optical element, and pupil position tracking logic that generates control signals to set a position of the foveal inset as perceived through the holographic optical element, to determine the peripheral display image, and to control a moveable stage.

TECHNIQUES TO IMPROVE CURRENT REGULATOR CAPABILITY TO PROTECT THE SECURED CIRCUIT FROM POWER SIDE CHANNEL ATTACK

Granted: October 28, 2021
Application Number: 20210336536
This disclosure relates to current flattening circuits for an electrical load. The current flattening circuits randomize various parameters to add noise onto the supply current. This added noise may act to reduce the signal-to-noise ratio in the supply current, increasing the difficulty of identifying a computational artifact signal from power rail noise.

CURRENT FLATTENING CIRCUIT FOR PROTECTION AGAINST POWER SIDE CHANNEL ATTACKS

Granted: October 28, 2021
Application Number: 20210334411
Various implementations of a current flattening circuit are disclosed, including those utilizing a feedback current regulator, a feedforward current regulator, and a constant current source.

DEEP LEARNING BASED IDENTIFICATION OF DIFFICULT TO TEST NODES

Granted: September 23, 2021
Application Number: 20210295169
Techniques to improve the accuracy and speed for detection and remediation of difficult to test nodes in a circuit design netlist. The techniques utilize improved netlist representations, test point insertion, and trained neural networks.

Circuit Solution for Managing Power Sequencing

Granted: September 23, 2021
Application Number: 20210294410
A circuit includes a supply power detector in a first power domain and a ratioed inverter in the first power domain or a second, different power domain. The supply power detector includes an output coupled to an input of the ratioed inverter, and an output of the ratioed inverter provides a power sequencing signal for the second power domain.

FAST TRIGGERING ELECTROSTATIC DISCHARGE PROTECTION

Granted: September 9, 2021
Application Number: 20210281067
An electrostatic discharge protection circuit is disclosed. It comprises a stacked structure of drain-ballasted NMOS devices and a gate bias circuit. The gate bias circuit includes an inverter, a first gate bias output terminal, and a second gate bias output terminal. The first gate bias output terminal is coupled to a gate of a first one of the drain-ballasted NMOS devices. The second gate bias output terminal runs from an output of the inverter to a gate of a second one of the…

ADDRESSING CACHE SLICES IN A LAST LEVEL CACHE

Granted: August 19, 2021
Application Number: 20210255963
A system having M memory controllers between a first memory and a second memory having N operative memory slices, where N and M are not evenly divisible, includes logic to operate the M memory controllers to linearly distribute addresses of the second memory across the N operative memory slices. The system may be utilized in commercial applications such as data centers, autonomous vehicles, and machine learning.

AVERAGE POWER ESTIMATION USING GRAPH NEURAL NETWORKS

Granted: May 27, 2021
Application Number: 20210158155
A graph neural network for average power estimation of netlists is trained with register toggle rates over a power window from an RTL simulation and gate level netlists as input features. Combinational gate toggle rates are applied as labels. The trained graph neural network is then applied to infer combinational gate toggle rates over a different power window of interest and/or different netlist.

LAYOUT PARASITICS AND DEVICE PARAMETER PREDICTION USING GRAPH NEURAL NETWORKS

Granted: May 27, 2021
Application Number: 20210158127
A graph neural network to predict net parasitics and device parameters by transforming circuit schematics into heterogeneous graphs and performing predictions on the graphs. The system may achieve an improved prediction rate and reduce simulation errors.

Data Recovery Technique for Time Interleaved Receiver in Presence of Transmitter Pulse Width Distortion

Granted: May 13, 2021
Application Number: 20210143824
This disclosure relates to a receiver comprising a clock and data recovery loop and a phase offset loop. The clock and data recovery loop may be controlled by a sum of gradients for a plurality of data interleaves. The phase offset loop may be controlled by an accumulated differential gradient for each of the data interleaves.