Google Patent Applications

Spatial Audio Communication Between Devices with Speaker Array and/or Microphone Array

Granted: September 28, 2023
Application Number: 20230308825
The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with direction information. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio to be…

Automatic Detection and Mitigation of Denial-of-Service Attacks

Granted: September 28, 2023
Application Number: 20230308476
A method for mitigating network abuse includes obtaining a first set of network traffic messages of network traffic currently received by a network service and determining, via a first model, whether network abuse is occurring based on the first set of network traffic messages. When the network abuse is occurring, the method includes obtaining a second set of current network traffic messages. The method also includes, for each network traffic message in the second set of network traffic…

Speech Recognition Using Word or Phoneme Time Markers Based on User Input

Granted: September 28, 2023
Application Number: 20230306965
A method for separating target speech from background noise contained in an input audio signal includes receiving the input audio signal captured by a user device, wherein the input audio signal corresponds to target speech of multiple words spoken by a target user and containing background noise in the presence of the user device while the target user spoke the multiple words in the target speech. The method also includes receiving a sequence of time markers input by the target user in…

Streaming End-to-end Multilingual Speech Recognition with Joint Language Identification

Granted: September 28, 2023
Application Number: 20230306958
A method includes receiving a sequence of acoustic frames as input to an automatic speech recognition (ASR) model. The method also includes generating, by a first encoder, a first higher order feature representation for a corresponding acoustic frame. The method also includes generating, by a second encoder, a second higher order feature representation for a corresponding first higher order feature representation. The method also includes generating, by a language identification (ID)…

Systems and Methods of Detecting and Responding to a Visitor to a Smart Home Environment

Granted: September 28, 2023
Application Number: 20230306826
A method of detecting and responding to a visitor to a smart home environment via an electronic greeting system of the smart home environment, including determining that a visitor is approaching an entryway of the smart home environment; initiating a facial recognition operation while the visitor is approaching the entryway; initiating an observation window in response to the determination that a visitor is approaching the entryway; obtaining context information from one or more sensors…

Label Propagation in a Distributed System

Granted: September 28, 2023
Application Number: 20230306060
Data are maintained in a distributed computing system that describe a graph. The graph represents relationships among items. The graph has a plurality of vertices that represent the items and a plurality of edges connecting the plurality of vertices. At least one vertex of the plurality of vertices includes a set of label values indicating the at least one vertex's strength of association with a label from a set of labels. The set of labels describe possible characteristics of an item…
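
The data layout this abstract describes (vertices carrying per-label strength values, connected by edges) can be illustrated with a minimal sketch. The abstract is truncated before any update rule is stated, so the averaging rule, the `damping` parameter, and the function name `propagate_once` below are assumptions chosen for illustration, not the method claimed in the application.

```python
from collections import defaultdict


def propagate_once(edges, labels, damping=0.5):
    """Run one synchronous round of a generic label-propagation update.

    edges:   iterable of (u, v) pairs, treated as undirected (assumption).
    labels:  dict mapping vertex -> {label: strength of association}.
    damping: hypothetical weight given to neighbor values versus own values.
    """
    # Build an adjacency map from the edge list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    updated = {}
    for vertex in set(neighbors) | set(labels):
        own = labels.get(vertex, {})
        adjacent = neighbors.get(vertex, ())
        # Average the label strengths currently held by the neighbors.
        incoming = defaultdict(float)
        for n in adjacent:
            for label, strength in labels.get(n, {}).items():
                incoming[label] += strength / len(adjacent)
        # Blend the vertex's own strengths with the neighbor average.
        updated[vertex] = {
            label: (1 - damping) * own.get(label, 0.0)
            + damping * incoming.get(label, 0.0)
            for label in set(own) | set(incoming)
        }
    return updated


edges = [("a", "b"), ("b", "c")]
labels = {"a": {"sports": 1.0}}
result = propagate_once(edges, labels)
```

Each round reads only the previous round's values, so in this sketch a label spreads by at most one hop per round: after one round vertex "b" has picked up part of the label from "a", while "c" has not yet received anything.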

Query Restartability

Granted: September 28, 2023
Application Number: 20230306028
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for restarting a query using a token. One of the methods includes receiving, by a computer from a requesting device, a query; determining, using a data storage system, a current result responsive to the query; generating, using the current result, a restart token that represents operations performed to determine a plurality of results responsive to the query including the current result…

Thermal Gradient Battery Monitoring System and Methods

Granted: September 28, 2023
Application Number: 20230304951
A battery pack includes a battery, a first temperature sensor configured to provide a first temperature value associated with a temperature of the battery, a heat source disposed proximate to the battery and configured to heat the battery, a second temperature sensor configured to provide a second temperature value associated with a temperature of the heat source, and a control board coupled to the first temperature sensor and the second temperature sensor, wherein the control board is…

Universal Hand Controller

Granted: September 28, 2023
Application Number: 20230305630
Techniques of controlling electronic devices using gestures use a wearable device on a user that translates, via a model, user movements into signals that identify both an electronic device to be controlled and a specific action to take with regard to that electronic device. The wearable device includes an inertial measurement unit (IMU) sensor and a photoplethysmography (PPG) sensor and measures six degrees of freedom (6DOF). The model is a convolutional neural network (CNN) that takes…

Garbage Collection for Data Storage

Granted: September 28, 2023
Application Number: 20230305733
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reclaiming storage space in a storage environment. In one aspect, the method includes actions of aggregating data that is indicative of access to one or more data objects, determining a future storage cost associated with each of a plurality of data objects, determining an access window for each of the plurality of data objects, identifying a data object based on (i) the future storage cost…

Microphone Array Configuration Invariant, Streaming, Multichannel Neural Enhancement Frontend for Automatic Speech Recognition

Granted: September 21, 2023
Application Number: 20230298612
A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self-attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of…

Optimal Time-to-Event Modeling for Longitudinal Prediction of Open Entities

Granted: September 21, 2023
Application Number: 20230297899
A method for optimal time-to-event (TTE) modeling includes obtaining a forecast request requesting performance of a TTE forecast forecasting an amount of time an event will occur after a starting point in time. The method includes obtaining a cutoff value representing an amount of time after the starting point in time that the event has not occurred. The method also includes forecasting, using an uncertainty forecasting model, the amount of time the event will occur after the starting…

Generalized Automatic Speech Recognition for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation

Granted: September 21, 2023
Application Number: 20230298609
A method for training a generalized automatic speech recognition model for joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving a plurality of training utterances paired with corresponding training contextual signals. The training contextual signals include a training contextual noise signal including noise prior to the corresponding training utterance, a training reference audio signal, and a training speaker vector including voice…

End-to-End Streaming Keyword Spotting

Granted: September 21, 2023
Application Number: 20230298576
A method for training hotword detection includes receiving a training input audio sequence including a sequence of input frames that define a hotword that initiates a wake-up process on a device. The method also includes feeding the training input audio sequence into an encoder and a decoder of a memorized neural network. Each of the encoder and the decoder of the memorized neural network includes sequentially-stacked single value decomposition filter (SVDF) layers. The method further…

Freeze Words

Granted: September 21, 2023
Application Number: 20230298575
A method for detecting freeze words includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device associated with the user. The method also includes processing, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation. The speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data.…
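
The endpointing behavior mentioned in this abstract (triggering after a predetermined duration of non-speech) can be sketched as follows. This is only an illustration of that one stated behavior: the per-frame speech flags are assumed to come from some voice activity detector, and the function name `find_endpoint`, the 10 ms frame length, the 500 ms threshold, and the choice to ignore silence before any speech are all hypothetical. The freeze-word handling itself is not shown because the abstract is truncated before describing it.

```python
def find_endpoint(is_speech, frame_ms=10, silence_ms=500):
    """Return the index of the frame at which endpointing triggers, or None.

    is_speech:  sequence of booleans, one per audio frame (assumed VAD output).
    frame_ms:   assumed duration of each frame in milliseconds.
    silence_ms: assumed predetermined duration of non-speech that triggers
                the endpoint.
    """
    silent_run_ms = 0
    heard_speech = False
    for i, speech in enumerate(is_speech):
        if speech:
            # Any speech frame resets the trailing-silence counter.
            heard_speech = True
            silent_run_ms = 0
        elif heard_speech:
            # Count non-speech only after the utterance has started.
            silent_run_ms += frame_ms
            if silent_run_ms >= silence_ms:
                return i
    return None


# 5 frames of leading silence, 20 frames of speech, 60 frames of silence.
frames = [False] * 5 + [True] * 20 + [False] * 60
endpoint = find_endpoint(frames)
```

With these assumed values, the endpoint fires on the fiftieth consecutive non-speech frame after the last speech frame, and an input that never goes silent yields `None`.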

Scalable Model Specialization Framework for Speech Model Personalization

Granted: September 21, 2023
Application Number: 20230298574
A method for speech conversion includes obtaining a speech conversion model configured to convert input utterances of human speech directly into corresponding output utterances of synthesized speech. The method further includes receiving a speech conversion request including input audio data corresponding to an utterance spoken by a target speaker associated with atypical speech and a speaker identifier uniquely identifying the target speaker. The method includes activating, using the…

Rare Word Recognition with LM-aware MWER Training

Granted: September 21, 2023
Application Number: 20230298570
A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a…

4-bit Conformer with Accurate Quantization Training for Speech Recognition

Granted: September 21, 2023
Application Number: 20230298569
A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes…

Using Non-Parallel Voice Conversion for Speech Conversion Models

Granted: September 21, 2023
Application Number: 20230298565
A method includes receiving a set of training utterances each including a non-synthetic speech representation of a corresponding utterance, and for each training utterance, generating a corresponding synthetic speech representation by using a voice conversion model. The non-synthetic speech representation and the synthetic speech representation form a corresponding training utterance pair. At each of a plurality of output steps for each training utterance pair, the method also includes…