Integrated Access Backhaul with an Adaptive Phase-Changing Device
Granted: February 1, 2024
Application Number:
20240039608
Techniques and apparatuses are described for integrated access backhaul with an adaptive phase-changing device (APD). In aspects, a donor base station determines to include an APD in a communication path for the wireless backhaul link with a node base station and apportions APD access to the APD for the node base station. The donor base station then communicates with the node base station using the surface of the APD and based on the apportioned APD access by using the…
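As a rough illustration of the apportioning step, the short Python sketch below hands out APD time slots to node base stations in round-robin fashion. The names (ApdSchedule, apportion_apd_access) and the slot-based access model are assumptions made for illustration and are not taken from the application itself.

    # Hypothetical sketch: a donor base station apportions APD access as time
    # slots among node base stations sharing the wireless backhaul link.
    from dataclasses import dataclass, field

    @dataclass
    class ApdSchedule:
        """Maps each node base station ID to the APD time slots it may use."""
        slots_per_frame: int
        assignments: dict = field(default_factory=dict)

    def apportion_apd_access(schedule: ApdSchedule, node_ids: list) -> ApdSchedule:
        """Round-robin the available APD slots across the requesting nodes."""
        for slot in range(schedule.slots_per_frame):
            node = node_ids[slot % len(node_ids)]
            schedule.assignments.setdefault(node, []).append(slot)
        return schedule

    schedule = apportion_apd_access(ApdSchedule(slots_per_frame=8), ["node-bs-1", "node-bs-2"])
    print(schedule.assignments)  # {'node-bs-1': [0, 2, 4, 6], 'node-bs-2': [1, 3, 5, 7]}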
Pin Fin Placement Assembly for Forming Temperature Control Element Utilized in Device Die Packages
Granted: February 1, 2024
Application Number:
20240038620
A pin fin placement assembly utilized to form pin fins in a thermal dissipating feature is provided. The pin fin placement assembly may place the pin fins on an IC die disposed in an IC package. The pin fin placement assembly may assist in placing large numbers of pin fins with desired profiles at desired locations on the IC die. The plurality of pin fins is formed in a first plurality of apertures in the pin fin placement assembly. A thermal process is then performed to solder…
Neural Networks For Speaker Verification
Granted: February 1, 2024
Application Number:
20240038245
This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the…
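The enrollment and verification steps can be pictured with a minimal sketch: a neural network maps each utterance to a fixed-size embedding, enrollment averages a user's embeddings into a profile, and verification compares a new embedding against that profile with cosine similarity. The embedding network is stubbed out here, and the function names and the 0.7 threshold are illustrative assumptions only.

    import numpy as np

    def embed_utterance(audio: np.ndarray) -> np.ndarray:
        """Stand-in for the trained neural network that maps an utterance to an embedding."""
        rng = np.random.default_rng(abs(hash(audio.tobytes())) % (2**32))
        return rng.standard_normal(256)

    def enroll(utterances) -> np.ndarray:
        """Average the embeddings of several enrollment utterances into a speaker profile."""
        return np.mean([embed_utterance(u) for u in utterances], axis=0)

    def verify(profile: np.ndarray, utterance: np.ndarray, threshold: float = 0.7) -> bool:
        """Accept the speaker if cosine similarity to the enrolled profile is high enough."""
        emb = embed_utterance(utterance)
        score = np.dot(profile, emb) / (np.linalg.norm(profile) * np.linalg.norm(emb))
        return bool(score >= threshold)

    profile = enroll([np.zeros(16000), np.ones(16000)])
    print(verify(profile, np.zeros(16000)))  # True or False, depending on the stub embeddings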
Attention-Based Clockwork Hierarchical Variational Encoder
Granted: February 1, 2024
Application Number:
20240038214
A method for representing an intended prosody in synthesized speech includes receiving a text utterance having at least one word, and selecting an utterance embedding for the text utterance. Each word in the text utterance has at least one syllable and each syllable has at least one phoneme. The utterance embedding represents an intended prosody. For each syllable, using the selected utterance embedding, the method also includes: predicting a duration of the syllable by decoding a…
SYSTEMS, METHODS, AND DEVICES FOR ACTIVITY MONITORING VIA A HOME ASSISTANT
Granted: February 1, 2024
Application Number:
20240038037
The various implementations described herein include methods, devices, and systems for monitoring activity in a home environment. In one aspect, a method performed at a voice-assistant device includes: detecting a sound; obtaining a determination as to whether the sound meets one or more monitoring criteria; and in accordance with a determination that the sound meets the one or more monitoring criteria, generating a notification.
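A minimal sketch of the detection-to-notification flow might look like the following; the specific criteria (a loudness floor and a set of watched sound labels) and all names here are assumptions chosen for illustration, not the criteria used in the application.

    WATCHED_LABELS = {"glass_break", "smoke_alarm"}

    def meets_monitoring_criteria(sound: dict, min_level_db: float = 60.0) -> bool:
        """Example criteria: loud enough and classified as a watched event type."""
        return sound["level_db"] >= min_level_db and sound["label"] in WATCHED_LABELS

    def notify(message: str) -> None:
        print("NOTIFICATION:", message)  # stand-in for sending a push notification

    def on_sound_detected(sound: dict) -> None:
        if meets_monitoring_criteria(sound):
            notify(f"Detected {sound['label']} at {sound['level_db']:.0f} dB")

    on_sound_detected({"label": "glass_break", "level_db": 72.0})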
Efficient Storage and Query of Schemaless Data
Granted: February 1, 2024
Application Number:
20240037146
A method of storing semi-structured data includes receiving user data from a user of a query system where the user data includes semi-structured user data. The method also includes receiving an indication that the semi-structured user data fails to include a fixed schema. In response to the indication that the semi-structured user data fails to include the fixed schema, the method further includes parsing the semi-structured user data into a plurality of data paths and extracting a data…
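The parsing step can be illustrated with a small sketch that flattens semi-structured data into (data path, value) pairs, which is one way such data can be stored and queried without a fixed schema. The function name and path syntax are assumptions for illustration.

    def extract_data_paths(value, prefix=""):
        """Yield (path, scalar_value) pairs for every leaf in a nested structure."""
        if isinstance(value, dict):
            for key, child in value.items():
                yield from extract_data_paths(child, f"{prefix}.{key}" if prefix else key)
        elif isinstance(value, list):
            for index, child in enumerate(value):
                yield from extract_data_paths(child, f"{prefix}[{index}]")
        else:
            yield prefix, value

    record = {"user": {"name": "Ada", "devices": [{"id": 1}, {"id": 2}]}}
    print(list(extract_data_paths(record)))
    # [('user.name', 'Ada'), ('user.devices[0].id', 1), ('user.devices[1].id', 2)]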
ATTENTIVE SCORING FUNCTION FOR SPEAKER IDENTIFICATION
Granted: January 25, 2024
Application Number:
20240029742
A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate a reference attentive d-vector representing voice characteristics of the utterance, the evaluation ad-vector including ne style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a…
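One plausible reading of the scoring step is sketched below: each ad-vector holds per-class value vectors and routing vectors, the routing vectors produce attention weights, and the weights combine value-vector similarities into a single score. This is only an interpretation for illustration; the application's actual scoring function may differ, and every name here is assumed.

    import numpy as np

    def attentive_score(ref_values, ref_routes, eval_values, eval_routes):
        """ref_values/eval_values: (classes, dim); ref_routes/eval_routes: (classes, rdim)."""
        logits = ref_routes @ eval_routes.T                                    # routing affinities
        weights = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax over eval classes
        sims = ref_values @ eval_values.T                                      # value-vector similarities
        return float((weights * sims).sum() / ref_values.shape[0])

    rng = np.random.default_rng(0)
    print(attentive_score(rng.standard_normal((4, 16)), rng.standard_normal((4, 8)),
                          rng.standard_normal((4, 16)), rng.standard_normal((4, 8))))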
Smart-Device-Based Radar System Performing Angular Position Estimation
Granted: January 25, 2024
Application Number:
20240027600
Techniques and apparatuses are described that implement a smart-device-based radar system capable of performing angular position estimation. A machine-learned module analyzes the generated complex range data to estimate angular positions of objects. The machine-learned module is implemented using a multi-stage architecture. In a local stage, the machine-learned module splits the complex range data into different range intervals and separately processes subsets of the complex range data using…
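The local stage can be pictured with a short numpy sketch: the complex range data is split into range intervals and each subset is processed separately, with a later stage left to combine the per-interval results. Shapes, the per-interval features, and all names are assumptions for illustration.

    import numpy as np

    def split_into_range_intervals(complex_range_data: np.ndarray, num_intervals: int):
        """complex_range_data: (num_channels, num_range_bins) complex array."""
        return np.array_split(complex_range_data, num_intervals, axis=-1)

    def process_interval(interval: np.ndarray) -> np.ndarray:
        """Stand-in per-interval feature extractor (mean magnitude and phase per channel)."""
        return np.stack([np.abs(interval).mean(axis=-1), np.angle(interval).mean(axis=-1)], axis=-1)

    rng = np.random.default_rng(1)
    data = rng.standard_normal((3, 64)) + 1j * rng.standard_normal((3, 64))
    features = [process_interval(chunk) for chunk in split_into_range_intervals(data, num_intervals=4)]
    print(len(features), features[0].shape)  # 4 (3, 2)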
Voice Query QoS based on Client-Computed Content Metadata
Granted: January 25, 2024
Application Number:
20240029740
A method includes receiving an automated speech recognition (ASR) request from a user device that includes a speech input captured by the user device and content metadata associated with the speech input. The content metadata is generated by the user device. The method also includes determining a priority score for the ASR request based on the content metadata associated with the speech input and caching the ASR request in a pre-processing backlog of pending ASR requests each having a…
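A minimal sketch of the priority-scoring and backlog steps is below; the metadata fields (interactive, speech_confidence), the weights, and the heap-based backlog are all assumptions used only to make the idea concrete.

    import heapq
    import itertools

    def priority_score(metadata: dict) -> float:
        """Illustrative weights: favor latency-sensitive and high-confidence requests."""
        score = 10.0 if metadata.get("interactive") else 0.0
        return score + metadata.get("speech_confidence", 0.0) * 5.0

    backlog = []                 # min-heap; scores are negated so the highest priority pops first
    counter = itertools.count()  # tie-breaker preserving arrival order among equal scores

    def enqueue(request_id: str, metadata: dict) -> None:
        heapq.heappush(backlog, (-priority_score(metadata), next(counter), request_id))

    enqueue("req-1", {"interactive": False, "speech_confidence": 0.4})
    enqueue("req-2", {"interactive": True, "speech_confidence": 0.9})
    print(heapq.heappop(backlog)[2])  # req-2 is dequeued first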
Unified End-To-End Speech Recognition And Endpointing Using A Switch Connection
Granted: January 25, 2024
Application Number:
20240029719
A single E2E multitask model includes a speech recognition model and an endpointer model. The speech recognition model includes an audio encoder configured to encode a sequence of audio frames into corresponding higher-order feature representations, and a decoder configured to generate probability distributions over possible speech recognition hypotheses for the sequence of audio frames based on the higher-order feature representations. The endpointer model is configured to operate…
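A heavily simplified PyTorch sketch of such a multitask model is shown below: a shared audio encoder feeds either a speech-recognition head or an endpointer head, with a boolean flag standing in for the switch connection. Layer types, sizes, and the switch mechanism are assumptions for illustration, not the architecture claimed in the application.

    import torch
    import torch.nn as nn

    class MultitaskASR(nn.Module):
        def __init__(self, feat_dim=80, hidden=256, vocab=128):
            super().__init__()
            self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)  # shared audio encoder
            self.asr_head = nn.Linear(hidden, vocab)                    # simplified recognition decoder
            self.endpoint_head = nn.Linear(hidden, 2)                   # per-frame speech / end-of-query

        def forward(self, frames, use_endpointer: bool):
            encoded, _ = self.encoder(frames)                           # higher-order feature representations
            head = self.endpoint_head if use_endpointer else self.asr_head
            return head(encoded).log_softmax(dim=-1)

    model = MultitaskASR()
    frames = torch.randn(1, 50, 80)  # a batch of 50 audio frames with 80-dim features
    print(model(frames, use_endpointer=False).shape)  # torch.Size([1, 50, 128])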
Flickering Reduction with Partial Hypothesis Re-ranking for Streaming ASR
Granted: January 25, 2024
Application Number:
20240029718
A method includes processing, using a speech recognizer, a first portion of audio data to generate a first lattice, and generating a first partial transcription for an utterance based on the first lattice. The method includes processing, using the recognizer, a second portion of the data to generate, based on the first lattice, a second lattice representing a plurality of partial speech recognition hypotheses for the utterance and a plurality of corresponding speech recognition scores.…
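One way to picture the re-ranking is a rule that prefers partial hypotheses consistent with the previously displayed prefix, which reduces visible flicker in streaming captions. The stability bonus and its weight below are assumptions for illustration, not the scoring used in the application.

    def rerank_partials(hypotheses, scores, previous_partial, stability_bonus=2.0):
        """Return the best partial hypothesis to display, given the last displayed text."""
        def adjusted(hyp, score):
            return score + (stability_bonus if hyp.startswith(previous_partial) else 0.0)
        return max(zip(hypotheses, scores), key=lambda hs: adjusted(*hs))[0]

    hyps = ["play some thing", "play something by", "plays um"]
    print(rerank_partials(hyps, [3.1, 3.0, 3.2], previous_partial="play some"))
    # "play some thing": a prefix-consistent hypothesis wins despite a lower raw score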
DETERMINATION OF USER PRESENCE AND ABSENCE USING WIFI CONNECTIONS
Granted: January 25, 2024
Application Number:
20240031847
Systems and techniques are provided for determination of user presence and absence using WiFi connections. Reports may be received from WiFi access points in an environment. The reports may include an identifier of a WiFi device, an indication of a connection to or disconnection from a WiFi access point, a time of the connection or disconnection, and an identifier of the WiFi access point. A connection sequence for the WiFi device may be generated from the reports. Whether the WiFi…
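A small sketch of the sequence-building step might look like the following; the report field names and the simple last-event presence rule are assumptions made only to illustrate the flow.

    def build_connection_sequence(reports, device_id):
        """reports: dicts with keys device_id, event ('connect'/'disconnect'), time, ap_id."""
        events = sorted((r for r in reports if r["device_id"] == device_id), key=lambda r: r["time"])
        return [(r["time"], r["event"], r["ap_id"]) for r in events]

    def is_present(sequence) -> bool:
        """A simple rule: the device is present if its most recent event was a connection."""
        return bool(sequence) and sequence[-1][1] == "connect"

    reports = [
        {"device_id": "phone-1", "event": "connect", "time": 100, "ap_id": "ap-livingroom"},
        {"device_id": "phone-1", "event": "disconnect", "time": 160, "ap_id": "ap-livingroom"},
    ]
    sequence = build_connection_sequence(reports, "phone-1")
    print(is_present(sequence))  # False: the last report was a disconnection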
Streaming Automatic Speech Recognition With Non-Streaming Model Distillation
Granted: January 25, 2024
Application Number:
20240029716
A method for training a streaming automatic speech recognition student model includes receiving a plurality of unlabeled student training utterances. The method also includes, for each unlabeled student training utterance, generating a transcription corresponding to the respective unlabeled student training utterance using a plurality of non-streaming automated speech recognition (ASR) teacher models. The method further includes distilling a streaming ASR student model from the plurality…
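The pseudo-labeling part of this pipeline can be sketched briefly: each non-streaming teacher transcribes an unlabeled utterance, a single transcription is selected (majority vote is one simple rule, used here as an assumption), and the resulting pairs become distillation targets for the streaming student. The teacher models are stubbed out.

    from collections import Counter

    def pseudo_label(utterance, teachers):
        """Pick the transcription most teachers agree on (illustrative selection rule)."""
        transcripts = [teacher(utterance) for teacher in teachers]
        return Counter(transcripts).most_common(1)[0][0]

    def build_distillation_set(unlabeled_utterances, teachers):
        return [(utt, pseudo_label(utt, teachers)) for utt in unlabeled_utterances]

    teachers = [lambda u: u.lower(), lambda u: u.lower(), lambda u: u.upper()]  # stand-in teachers
    print(build_distillation_set(["Hello World"], teachers))  # [('Hello World', 'hello world')]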
Using Aligned Text and Speech Representations to Train Automatic Speech Recognition Models without Transcribed Speech Data
Granted: January 25, 2024
Application Number:
20240029715
A method includes receiving training data that includes unspoken textual utterances in a target language. Each unspoken textual utterance is not paired with any corresponding spoken utterance of non-synthetic speech. The method also includes generating a corresponding alignment output for each unspoken textual utterance using an alignment model trained on transcribed speech utterances in one or more training languages, each different from the target language. The method also includes…
DEVICES AND METHODS FOR A SPEECH-BASED USER INTERFACE
Granted: January 25, 2024
Application Number:
20240029706
A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The…
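A minimal sketch of the source-to-voice assignment might look like this; the source identifiers and voice names are invented for illustration, and the TTS call is stubbed out.

    VOICE_ASSIGNMENTS = {
        "app:navigation": "voice_a",
        "os:notifications": "voice_b",
        "screen:top_banner": "voice_c",
    }

    def speak(source: str, text: str) -> str:
        voice = VOICE_ASSIGNMENTS.get(source, "voice_default")
        return f"[{voice}] {text}"  # stand-in for invoking a TTS engine with the assigned voice

    print(speak("os:notifications", "Battery is low"))  # [voice_b] Battery is low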
Joint Speech and Text Streaming Model for ASR
Granted: January 25, 2024
Application Number:
20240028829
A method includes receiving training data that includes a set of unspoken textual utterances. For each respective unspoken textual utterance, the method includes tokenizing the respective textual utterance into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit tokenized from the respective unspoken textual utterance, receiving the first higher order textual feature representation generated by a text encoder,…
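The tokenization step can be illustrated with a toy greedy sub-word tokenizer; the sub-word inventory below is invented for the example (real systems learn it from data), and the rest of the pipeline (text encoder, feature representations) is not shown.

    SUBWORDS = ["hel", "lo", "wor", "ld", "_"]  # toy inventory used only for this example

    def tokenize(text: str):
        """Greedy longest-match sub-word tokenization, falling back to single characters."""
        units, i = [], 0
        text = text.replace(" ", "_")
        while i < len(text):
            match = next((sw for sw in sorted(SUBWORDS, key=len, reverse=True)
                          if text.startswith(sw, i)), text[i])
            units.append(match)
            i += len(match)
        return units

    print(tokenize("hello world"))  # ['hel', 'lo', '_', 'wor', 'ld']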