Dolby Laboratories Patent Applications

OVER-SUPPRESSION MITIGATION FOR DEEP LEARNING BASED SPEECH ENHANCEMENT

Granted: August 29, 2024
Application Number: 20240290341
A system for mitigating over-suppression of speech and other non-noise signals is disclosed. In some embodiments, a system is programmed to train a first machine learning model for speech detection or enhancement using a non-linear, asymmetric loss function that penalizes speech over-suppression more than speech under-suppression. The first machine learning model is configured to receive an audio signal and generate a mask indicating an amount of speech present in the audio signal. The…
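
As a rough illustration of the asymmetric loss idea in this abstract, the sketch below penalizes a predicted speech mask more heavily when it falls below the target (over-suppression) than when it rises above it. The function name `asymmetric_mask_loss`, the squared-error form, and the `over_weight` value are assumptions made for illustration; the abstract does not give the actual loss function.

```python
def asymmetric_mask_loss(predicted, target, over_weight=4.0):
    """Mean per-bin penalty; over-suppression (predicted < target) costs more.

    Hypothetical form: squared error, scaled by over_weight on the
    over-suppression side only. Not taken from the patent application.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("masks must be non-empty and the same length")
    total = 0.0
    for p, t in zip(predicted, target):
        diff = p - t
        if diff < 0:
            # Mask is too low: speech would be removed, so weight it up.
            total += over_weight * diff * diff
        else:
            # Mask is too high: some noise leaks through, normal weight.
            total += diff * diff
    return total / len(predicted)


target = [1.0, 0.5, 0.0]
over = asymmetric_mask_loss([0.8, 0.3, 0.0], target)   # two bins too low
under = asymmetric_mask_loss([1.0, 0.7, 0.2], target)  # two bins too high
print(over > under)
```

With errors of equal size, the over-suppressed prediction receives the larger loss, which is the behavior the abstract describes.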

RENDERING BINAURAL AUDIO OVER MULTIPLE NEAR FIELD TRANSDUCERS

Granted: August 22, 2024
Application Number: 20240284104
An apparatus and method of rendering audio. A binaural signal is split on an amplitude weighting basis into a front binaural signal and a rear binaural signal, based on perceived position information of the audio. In this manner, the front-back differentiation of the binaural signal is improved.
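
One way to picture the amplitude-weighted split is sketched below. It assumes the perceived position is given as an azimuth in degrees (0 is straight ahead, 180 is directly behind) and uses a sine/cosine weighting pair; both the helper name `split_front_rear` and the weighting law are hypothetical, since the abstract does not specify them.

```python
import math


def split_front_rear(left, right, azimuth_deg):
    """Split a binaural (left, right) signal into front and rear copies.

    Assumed weighting: w_front = cos(|az| / 2), w_rear = sin(|az| / 2),
    so w_front**2 + w_rear**2 == 1 for |az| between 0 and 180 degrees.
    """
    half = math.radians(abs(azimuth_deg)) / 2.0
    w_front, w_rear = math.cos(half), math.sin(half)
    front = ([w_front * s for s in left], [w_front * s for s in right])
    rear = ([w_rear * s for s in left], [w_rear * s for s in right])
    return front, rear


# A source straight ahead goes entirely to the front pair.
front, rear = split_front_rear([1.0, -1.0], [0.5, -0.5], azimuth_deg=0.0)
print(front, rear)
```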

SIGNAL RESHAPING FOR HIGH DYNAMIC RANGE SIGNALS

Granted: August 22, 2024
Application Number: 20240283983
In a method to improve backwards compatibility when decoding high-dynamic range images coded in a wide color gamut (WCG) space which may not be compatible with legacy color spaces, hue and/or saturation values of images in an image database are computed for both a legacy color space (say, YCbCr-gamma) and a preferred WCG color space (say, IPT-PQ). Based on a cost function, a reshaped color space is computed so that the distance between the hue values in the legacy color space and rotated…

SIGNALING OF PRIORITY PROCESSING ORDER FOR METADATA MESSAGING IN VIDEO CODING

Granted: August 22, 2024
Application Number: 20240283978
Methods, systems, and bitstream syntax are described for determining the processing order of metadata messaging, such as supplemental enhancement information (SEI) messaging in MPEG video coding.

ROTATION-ENABLED HIGH DYNAMIC RANGE VIDEO ENCODING

Granted: August 22, 2024
Application Number: 20240283975
Methods for encoding and decoding video data. One method includes receiving video data, the video data composed of a plurality of image frames, each image frame including a plurality of pixels. The method includes determining, for each image frame, a rotation matrix, determining, for each image frame, at least one of a scaling factor and an offset factor, and determining, for each image frame, a reshaping function based on one or more values of each of the plurality of pixels. The method…

AUDIO DECODER AND DECODING METHOD

Granted: August 22, 2024
Application Number: 20240282323
A method for representing a second presentation of audio channels or objects as a data stream, the method comprising the steps of: (a) providing a set of base signals, the base signals representing a first presentation of the audio channels or objects; (b) providing a set of transformation parameters, the transformation parameters intended to transform the first presentation into the second presentation; the transformation parameters further being specified for at least two frequency…

METHODS AND SYSTEMS FOR GENERATING AND RENDERING OBJECT BASED AUDIO WITH CONDITIONAL RENDERING METADATA

Granted: August 22, 2024
Application Number: 20240282322
Methods and audio processing units for generating an object based audio program including conditional rendering metadata corresponding to at least one object channel of the program, where the conditional rendering metadata is indicative of at least one rendering constraint, based on playback speaker array configuration, which applies to each corresponding object channel, and methods for rendering audio content determined by such a program, including by rendering content of at least one…

MULTICHANNEL AUDIO ENCODE AND DECODE USING DIRECTIONAL METADATA

Granted: August 22, 2024
Application Number: 20240282321
The disclosure relates to methods of processing a spatial audio signal for generating a compressed representation of the spatial audio signal. The methods include analyzing the spatial audio signal to determine directions of arrival for one or more audio elements; for at least one frequency subband, determining respective indications of signal power associated with the directions of arrival; generating metadata including direction information that includes indications of the directions…

AUGMENTED REALITY AND SCREEN IMAGE RENDERING COORDINATION

Granted: August 15, 2024
Application Number: 20240272712
A first image for rendering on a first image display in a combination of a stationary image display and a non-stationary image display is received. A visual object depicted in the first image is identified. A corresponding image portion in a second image is generated for rendering on a second image display in the combination of the stationary image display and the non-stationary image display. The corresponding image portion in the second image as rendered on the second image display…

RENDERING AUDIO OVER MULTIPLE SPEAKERS WITH MULTIPLE ACTIVATION CRITERIA

Granted: August 8, 2024
Application Number: 20240267679
Methods for rendering audio for playback by two or more speakers are disclosed. The audio includes one or more audio signals, each with an associated intended perceived spatial position. Relative activation of the speakers may be a cost function of a model of perceived spatial position of the audio signals when played back over the speakers, a measure of proximity of the intended perceived spatial position of the audio signals to positions of the speakers, and one or more additional…

CHAINED RESHAPING FUNCTION OPTIMIZATION

Granted: August 8, 2024
Application Number: 20240267530
An input image to a pipeline of chained reshaping functions is received. Reference images are generated from the input image. The input image and the reference images are used to determine operational parameters for chained reshaping functions in the pipeline of chained reshaping functions. A reshaped image generated from one or more of the chained reshaping functions is encoded in a video signal along with image metadata. The image metadata includes some or all of the operational…

SYSTEMS AND METHODS FOR COVARIANCE SMOOTHING

Granted: August 8, 2024
Application Number: 20240265927
Methods and systems for improving signal processing by smoothing the covariance matrix of a multi-channel signal by setting a forgetting factor based on the bins of a band. A method and system for resetting the smoothing based on transient detection is also disclosed. A method and system for resampling for the smoothing during a banding transition is also disclosed.
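
The smoothing step can be sketched as a first-order recursive average whose forgetting factor depends on how many frequency bins a band contains. The rule `alpha = min(1, num_bins / target_bins)`, the default `target_bins=8`, and both function names are assumptions for illustration only; the abstract states just that the forgetting factor is set based on the bins of a band.

```python
def forgetting_factor(num_bins, target_bins=8):
    """Hypothetical rule: bands with fewer bins get a smaller factor,
    which means heavier smoothing over time."""
    return min(1.0, num_bins / target_bins)


def smooth_covariance(previous, current, num_bins, target_bins=8):
    """Blend the previous smoothed covariance with the current estimate:
    smoothed = (1 - alpha) * previous + alpha * current."""
    alpha = forgetting_factor(num_bins, target_bins)
    return [
        [(1.0 - alpha) * p + alpha * c for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


previous = [[1.0, 0.0], [0.0, 1.0]]
current = [[3.0, 2.0], [2.0, 3.0]]
narrow = smooth_covariance(previous, current, num_bins=2)   # alpha = 0.25
wide = smooth_covariance(previous, current, num_bins=16)    # alpha = 1.0
print(narrow, wide)
```

Under this assumed rule, a wide band tracks the current estimate directly, while a narrow band leans on its history to compensate for having fewer bins to average over.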

MUSIC SYNTHESIZER WITH SPATIAL METADATA OUTPUT

Granted: August 8, 2024
Application Number: 20240265897
Described are apparatus for generating and/or processing audio signals. One apparatus includes: a first stage for obtaining an audio signal; a second stage for modifying the audio signal based on one or more control signals for shaping sound represented by the audio signal; a third stage for generating spatial metadata related to the modified audio signal, based at least in part on the one or more control signals; and an output stage for outputting the modified audio signal together with…

METHOD AND APPARATUS FOR SCREEN RELATED ADAPTATION OF A HIGHER-ORDER AMBISONICS AUDIO SIGNAL

Granted: August 1, 2024
Application Number: 20240259750
A method for generating loudspeaker signals associated with a target screen size is disclosed. The method includes receiving a bit stream containing encoded higher order ambisonics signals, the encoded higher order ambisonics signals describing a sound field associated with a production screen size. The method further includes decoding the encoded higher order ambisonics signals to obtain a first set of decoded higher order ambisonics signals representing dominant components of the sound…

METHODS AND APPARATUS FOR COMPRESSING AND DECOMPRESSING A HIGHER ORDER AMBISONICS REPRESENTATION

Granted: August 1, 2024
Application Number: 20240259743
Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels contain either directional signals or additional…

FRAME-RATE SCALABLE VIDEO CODING

Granted: July 25, 2024
Application Number: 20240251093
Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific…

SYSTEM AND METHOD FOR NON-DESTRUCTIVELY NORMALIZING LOUDNESS OF AUDIO SIGNALS WITHIN PORTABLE DEVICES

Granted: July 25, 2024
Application Number: 20240249738
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A…

MULTI-STEP DISPLAY MAPPING AND METADATA RECONSTRUCTION FOR HDR VIDEO

Granted: July 25, 2024
Application Number: 20240249701
Methods and systems for multi-step display mapping and metadata reconstruction for high-dynamic range (HDR) images are described. In an encoder, given an HDR input image with input HDR metadata in a first dynamic range, an intermediate, base layer image in a second dynamic range is constructed based on the input image. In a decoder, using base-layer metadata, the input HDR metadata, and dynamic range characteristics of a target display, a processor generates reconstructed metadata which…

SIGNAL RESHAPING FOR HIGH DYNAMIC RANGE SIGNALS

Granted: July 18, 2024
Application Number: 20240244271
In a method to improve backwards compatibility when decoding high-dynamic range images coded in a wide color gamut (WCG) space which may not be compatible with legacy color spaces, hue and/or saturation values of images in an image database are computed for both a legacy color space (say, YCbCr-gamma) and a preferred WCG color space (say, IPT-PQ). Based on a cost function, a reshaped color space is computed so that the distance between the hue values in the legacy color space and rotated…

PROJECTOR AND METHOD FOR INCREASING PROJECTED LIGHT INTENSITY

Granted: July 4, 2024
Application Number: 20240219820
A projector includes a light source, an integrating rod, an image panel, a beam shaper, and an actuator mechanically connected to the beam shaper. The image panel is configured to display an image at a displayed aspect ratio. The beam shaper includes multiple prisms shaped and oriented such that when the beam shaper intersects an optical path of the illumination between the integrating rod and the image panel, the illumination transmitted by the beam shaper is collinear with the…