BI-DIRECTIONAL DYNAMIC FUNCTION EXCHANGE
Granted: June 20, 2024
Application Number:
20240202421
Bi-directional dynamic function exchange (DFX) can include receiving a circuit design for a programmable integrated circuit (IC). The circuit design includes a plurality of DFX partitions coupled by a signal path. The circuit design can be placed using a first plurality of DFX modules for the plurality of DFX partitions, in part, by selecting a flip-flop of a connection block as a boundary flip-flop of the signal path for each DFX module of the plurality of DFX modules. The circuit…
PROGRAMMABLE STREAM SWITCHES AND FUNCTIONAL SAFETY CIRCUITS IN INTEGRATED CIRCUITS
Granted: June 13, 2024
Application Number:
20240195418
An integrated circuit (IC) may include a plurality of compute tiles in a data processing array. Each compute tile is configured to perform a data processing function. The IC may include a plurality of interface tiles in the data processing array. The plurality of interface tiles are communicatively linked to the plurality of compute tiles. The IC may include a plurality of programmable stream switches disposed in the plurality of compute tiles and the plurality of interface tiles. The IC…
ALIGNMENT OF MACROS BASED ON ANCHOR LOCATIONS
Granted: June 13, 2024
Application Number:
20240193341
Placement of macros of a circuit design includes mapping the macros to types of sub-circuits of an integrated circuit (IC). The IC includes anchors and instances of each type of the types of sub-circuits. The macros are grouped based on couplings of the macros to the anchors specified in the circuit design. Each group includes one or more macros, and the one or more macros in each group are all coupled to the same set of one or more anchors. A location is selected from alternative…
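The grouping step in this abstract (macros that are coupled to the same set of anchors land in the same group) can be sketched as below. This is only an illustration: the function name `group_macros_by_anchors`, the dictionary input format, and the macro and anchor names are hypothetical and do not come from the application.

```python
from collections import defaultdict


def group_macros_by_anchors(couplings):
    """Group macros that are coupled to exactly the same set of anchors.

    couplings: dict mapping a macro name to an iterable of anchor names
    (hypothetical input format, assumed for illustration only).
    """
    groups = defaultdict(list)
    for macro, anchors in couplings.items():
        # frozenset makes the anchor set hashable and order-independent.
        groups[frozenset(anchors)].append(macro)
    return dict(groups)


# Hypothetical design: two RAM macros share anchors, one DSP macro does not.
couplings = {
    "ram0": ["io_a", "io_b"],
    "ram1": ["io_b", "io_a"],
    "dsp0": ["io_c"],
}
groups = group_macros_by_anchors(couplings)
```

Keying on a `frozenset` means the order in which a macro lists its anchors does not affect which group it joins.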
COMPRESSION OF SPARSE MATRICES FOR VECTOR PROCESSING
Granted: June 13, 2024
Application Number:
20240193227
Partition-level compression of an m×n sparse matrix includes determining in each partition, row and column indices of elements having non-zero values. Each partition has s rows and t columns and s<m and t<n. A group of ordered sets of tuples is generated from the elements and row and column indices in each partition that has at least one non-zero element. Each ordered set includes s tuples, and positions of the s tuples in the ordered set correspond to the s rows of the partition,…
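The first step described here, finding the row and column indices of non-zero elements in each s×t partition and skipping partitions with no non-zero element, can be sketched as follows. The function name, the dictionary output, and the (row, column, value) tuple layout are assumptions made for illustration; the abstract is truncated before it specifies the actual tuple format, so this sketch does not attempt to reproduce it.

```python
def nonzero_indices_by_partition(matrix, s, t):
    """Return {(partition_row, partition_col): [(row, col, value), ...]}.

    Row and column indices are local to the partition. Partitions that
    contain only zeros are omitted. The output layout is an assumption,
    not the tuple format of the application.
    """
    m, n = len(matrix), len(matrix[0])
    result = {}
    for r0 in range(0, m, s):
        for c0 in range(0, n, t):
            entries = [
                (r - r0, c - c0, matrix[r][c])
                for r in range(r0, min(r0 + s, m))
                for c in range(c0, min(c0 + t, n))
                if matrix[r][c] != 0
            ]
            if entries:  # keep only partitions with a non-zero element
                result[(r0 // s, c0 // t)] = entries
    return result


matrix = [
    [0, 0, 5, 0],
    [0, 0, 0, 0],
    [0, 7, 0, 0],
    [0, 0, 0, 9],
]
parts = nonzero_indices_by_partition(matrix, 2, 2)
```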
SYNTHESIS FOR MATRIX MULTIPLICATION USING A DATA PROCESSING ARRAY
Granted: June 13, 2024
Application Number:
20240193225
Parameters defining a matrix multiply operation to be implemented in a data processing array can be received. A formulation of the matrix multiply operation is generated based on the parameters. A matrix multiply solution is determined for performing the matrix multiply operation in the data processing array. The matrix multiply solution specifies a spatial and temporal partitioning of the matrix multiply operation for implementation in the data processing array. Synthesizable program…
CLOCKING ARCHITECTURE FOR COMMUNICATING CLOCK SIGNALS HAVING DIFFERENT FREQUENCIES OVER A COMMUNICATION INTERFACE
Granted: June 6, 2024
Application Number:
20240184736
An integrated circuit (IC) device includes a first IC chip, a second IC chip, and a chip-to-chip interface connected between the first IC chip and the second IC chip. The chip-to-chip interface communicates an interface clock signal and a logic clock signal between the first IC chip and the second IC chip. A frequency of the interface clock signal is a multiple of a frequency of the logic clock signal.
MULTI-THREADED CYCLE-ACCURATE ARCHITECTURE SIMULATION
Granted: June 6, 2024
Application Number:
20240184616
A thread manager creates multiple threads to execute a simulation of subsystems of a system-on-chip on multiple processor cores in response to execution of a simulation program. The threads execute multiple cycle-accurate simulation models of the subsystems in parallel in an execution phase of each simulation cycle of a plurality of simulation cycles of the simulation. The threads update interfaces of the simulation models in an update phase of each simulation cycle of the plurality…
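The execute-then-update structure of each simulation cycle can be sketched with a barrier between the two phases. This is a minimal illustration, not the method of the application: the `Counter` model, the one-thread-per-model assignment, and the use of `threading.Barrier` are all assumptions. In CPython the threads here show the phase ordering rather than a real multi-core speedup.

```python
import threading


class Counter:
    """Toy model (hypothetical): next value is the neighbor's value plus 1."""

    def __init__(self, value):
        self.value = value    # interface value visible to other models
        self._next = value    # value computed during the execution phase
        self.neighbor = None

    def execute(self):
        # Execution phase: read only interface values from the prior cycle.
        self._next = self.neighbor.value + 1

    def update(self):
        # Update phase: publish the newly computed value on the interface.
        self.value = self._next


def simulate(models, cycles):
    barrier = threading.Barrier(len(models))

    def worker(model):
        for _ in range(cycles):
            model.execute()
            barrier.wait()    # every model has finished executing
            model.update()
            barrier.wait()    # every interface is updated before next cycle

    threads = [threading.Thread(target=worker, args=(m,)) for m in models]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


a, b = Counter(0), Counter(10)
a.neighbor, b.neighbor = b, a
simulate([a, b], cycles=3)
```

Because no model writes an interface until all models have finished reading, the result does not depend on how the threads are scheduled.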
IMPLEMENTING BURST TRANSFERS FOR PREDICATED MEMORY ACCESSES IN LOOP BODIES FOR HIGH-LEVEL SYNTHESIS
Granted: May 30, 2024
Application Number:
20240176936
Implementing burst transfers for predicated accesses in high-level synthesis includes generating, using computer hardware, an intermediate representation of a design specified in a high-level programming language. The design is for an integrated circuit. Using the computer hardware, loop predicate information for one or more conditional statements within a loop body of the intermediate representation is determined. A plurality of memory accesses of the loop body guarded by the one or…
INSTRUCTION PRUNING FOR NEURAL NETWORKS
Granted: May 30, 2024
Application Number:
20240176981
In pruning weights from a neural network (NN), a design tool selects a dt-ds pair from a plurality of dt-ds pairs supported by a target device. Each dt-ds pair specifies a data type, dt, and an associated circuit structure, ds, that is configurable to compute d×s operations in parallel on a set of input activations and a matrix of weights of the data type, where d is a number of rows in a sub-matrix of the matrix of weights, s is a number of columns in the sub-matrix, and d×s≥1. The design…
DATAFLOW BASED ANALYSIS GUIDANCE TO MAPPER FOR BUFFERS ALLOCATION IN MULTICORE ARCHITECTURES
Granted: May 30, 2024
Application Number:
20240176942
Providing dataflow based guidance for buffer allocation in a multicore circuit architecture includes converting, using computer hardware, an application specified in a high-level programming language into an intermediate representation. Buffers of dataflows of the intermediate representation are detected. Whether the buffers are independent or dependent is determined based on an analysis of the dataflows of the intermediate representation. Buffer constraints are generated. The buffer…
PROGRAMMABLE DATA MOVEMENT PROCESSOR FOR COLLECTIVE COMMUNICATION OFFLOAD
Granted: May 30, 2024
Application Number:
20240176652
A system includes a network-on-chip (NoC). The system includes a protocol offload engine coupled to the NoC. The protocol offload engine is configured to generate packets of data for a selected protocol. The system includes a data movement processor coupled to the network-on-chip. The data movement processor is configured to receive a microcode instruction and, in response to the microcode instruction, establish data paths in the NoC that communicatively link a plurality of circuits…
CLOCK RECOVERY CIRCUIT
Granted: May 2, 2024
Application Number:
20240144897
A clock buffer has a clock-in port that inputs a reference clock and an enable port that inputs a video-clock-enable signal from a video receiver. The clock buffer generates a video pixel clock signal that has pulses of the reference signal as enabled by the video-clock-enable signal. The video receiver includes a link symbol extractor, a link-to-pixel mapper, and a timing generator that work to mirror the actual pixel data rate from the active period in a blanking period and thereby…
REDUCTION TREE HAVING INTERMEDIATE QUANTIZATION BETWEEN REDUCTION OPERATORS
Granted: May 2, 2024
Application Number:
20240143280
A circuit arrangement includes reduction operator circuits arranged in a first level of a reduction tree. Each reduction operator circuit accumulates respective products into a respective sum. Quantizer circuits are configured to quantize the sums from the reduction operator circuits into quantized sums, respectively, based on values of the sums relative to respective first thresholds. Another reduction operator circuit is arranged in a second level of the reduction tree and is…
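A software sketch of the first level and the intermediate quantization might look like the following. Several details are assumptions made for illustration: the quantized value is taken to be the number of thresholds a sum meets or exceeds, one shared threshold list stands in for the "respective" per-quantizer thresholds, and the second level is assumed to add the quantized sums, since the abstract is truncated before it says what that level does.

```python
from bisect import bisect_right


def quantize(value, thresholds):
    """Count how many sorted thresholds the value meets or exceeds.

    This particular mapping is an assumption; the abstract only says the
    quantization depends on the sum relative to the thresholds.
    """
    return bisect_right(thresholds, value)


def reduction_tree(groups, thresholds):
    # First level: each reduction operator accumulates its products.
    sums = [sum(a * w for a, w in group) for group in groups]
    # Intermediate quantization of each first-level sum.
    quantized = [quantize(s, thresholds) for s in sums]
    # Second level (assumed): accumulate the quantized sums.
    return sums, quantized, sum(quantized)


# Hypothetical (activation, weight) pairs for three first-level operators.
groups = [
    [(1, 2), (3, 4)],
    [(2, -1), (1, 1)],
    [(5, 1), (0, 7)],
]
thresholds = [0, 4, 8, 16]
sums, quantized, total = reduction_tree(groups, thresholds)
```

Quantizing between levels keeps the values entering the second level small, which is the property the title points to.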
ADAPTABLE FRAMEWORK FOR CIRCUIT DESIGN SIMULATION VERIFICATION
Granted: April 25, 2024
Application Number:
20240135074
An adaptable framework for circuit design simulation verification generates a simulation database for a circuit design and processed design data for the circuit design. The processed design data includes source files for the circuit design referenced by the simulation database. The simulation database and the processed design data are exported from a host integrated development environment (IDE). A template writer configured to generate a simulation script for the circuit design using…
SWITCHING BETWEEN REDUNDANT AND NON-REDUNDANT MODES OF SOFTWARE EXECUTION
Granted: April 11, 2024
Application Number:
20240118901
Executing critical and non-critical sections of program code includes executing a non-critical section of a first program by a first processor and executing a non-critical section of a second program by a second processor. The first processor signals the second processor with context to commence redundant execution of the critical section. The second processor switches from executing the second program to executing the critical section of the first program. The first processor executes…
MULTIPLIER BLOCK FOR BLOCK FLOATING POINT AND FLOATING POINT VALUES
Granted: April 11, 2024
Application Number:
20240118868
A mode control circuit operates a circuit arrangement in either a first mode to multiply floating point operands or a second mode to compute a dot product of two vectors of block floating point values. A block of multiplier circuits generates products from first pairs of p-terms. Each p-term is a portion of a significand of one of the floating point operands when operating in the first mode, or a significand of one of the block floating point values when operating in the second mode. An…
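The second mode, a dot product of two vectors of block floating point values, can be illustrated arithmetically. The sketch assumes the common block floating point convention in which every element of a block is an integer significand scaled by one shared power-of-two exponent; the function names and this representation are assumptions, and the sketch does not model the p-term multiplier block itself.

```python
def bfp_dot(sig_a, exp_a, sig_b, exp_b):
    """Dot product of two block floating point vectors.

    sig_a, sig_b: integer significands; exp_a, exp_b: shared block exponents.
    Returns (integer accumulator, result exponent).
    """
    # Significands are multiplied and accumulated as plain integers.
    acc = sum(x * y for x, y in zip(sig_a, sig_b))
    # The shared exponents are added once for the whole block.
    return acc, exp_a + exp_b


def to_float(sig, exp):
    """Convert a significand and exponent to an ordinary float."""
    return sig * 2.0 ** exp


# Block a represents [0.75, -0.5, 1.0]; block b represents [0.5, 2.5, -3.0].
acc, exp = bfp_dot([3, -2, 4], -2, [1, 5, -6], -1)
result = to_float(acc, exp)
```

Because the exponent is shared, the inner loop needs only integer multipliers and one adder tree, with a single exponent addition per block.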
SATISFYING CIRCUIT DESIGN CONSTRAINTS USING A COMBINATION OF MACHINE LEARNING MODELS
Granted: April 4, 2024
Application Number:
20240111932
Multiple classifier models are applied to features of a circuit design after processing the design through a first phase of an implementation flow. Each classifier model is associated with one of multiple directives, the directives are associated with a second phase of the implementation flow, and each classifier model returns a value indicative of likelihood of improving a quality metric. Regressor models of each set of a plurality of sets of regressor models are applied to the…
IMPLEMENTING DATA FLOWS OF AN APPLICATION ACROSS A MEMORY HIERARCHY OF A DATA PROCESSING ARRAY
Granted: March 21, 2024
Application Number:
20240094944
Implementing data flows of an application across a memory hierarchy of a data processing array includes receiving a data flow graph specifying an application for execution on the data processing array. A plurality of buffer objects corresponding to a plurality of different levels of the memory hierarchy of the data processing array and an external memory are identified. The plurality of buffer objects specify data flows. Buffer object parameters are determined. The buffer object…
MULTIPLE PARTITIONS IN A DATA PROCESSING ARRAY
Granted: March 14, 2024
Application Number:
20240088900
An apparatus includes a data processing array having a plurality of array tiles. The plurality of array tiles include a plurality of compute tiles. The compute tiles include a core coupled to a random-access memory (RAM) in a same compute tile and to a RAM of at least one other compute tile. The data processing array is subdivided into a plurality of partitions. Each partition includes a plurality of array tiles including at least one of the plurality of compute tiles. The apparatus…
FRACTIONAL LOGARITHMIC NUMBER SYSTEM ADDER
Granted: February 29, 2024
Application Number:
20240069865
An adder for fractional logarithmic number system (FLNS) format operands includes a compare-and-swap circuit that inputs first and second FLNS operands represented by fixed point values and provides a greater one as operand x and a lesser or equal one as operand y. Sign bits of x and y are sx and sy, respectively; qx and qy are integer portions of x and y, respectively; and fraction portions of x and y have integer values rx and ry, respectively. The compare-and-swap circuit is configured…