DETAILED ACTION
This action is in response to the application filed 03/28/2024. Claims 1 – 20 are pending and have
been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3, 4, 7, 8, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”) and Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”).
Regarding Claim 1, Marman teaches
A system (see Marman Abstract, system), comprising:
a display component associated with a user device (see Marman Figure 1, display 280 in user’s computer 320); and
a control circuit coupled to the display component (see Marman Paragraph [0037], display management module), the control circuit to:
receive a mixed media data signal from an external device (see Marman Paragraph [0038], When an object of interest is present in the scene, video analytics 120 send to display management module 340 the metadata corresponding to location information of the object of interest. Display management module 340 also receives the first set of high resolution image data produced by imager 115. The first set of image data may be sent to display management module 340 from video analytics 120, imager 115, or data storage system 255, Paragraph [0048], display management module 340 and a storage device 390 of data storage system 255 are remote from camera 110, and Paragraph [0020], camera 110 is a megapixel video camera including a high resolution megapixel imager 115 implemented with an advanced pixel architecture for capturing images of a field of view of camera 110);
create a video frame from the mixed media data signal (see Marman Paragraph [0038], the metadata sent from video analytics 120 are synchronized, frame by frame, to the first set of image data to enable display management module 340 to generate a video display window that zooms in on and tracks the object);
delineate a person in the video frame from a background with a bounding box (see Marman Paragraph [0063] and Figure 7, Image 700 contains representations of a first object 720 (a first person) and a second object 730 (a second person) captured in the scene. Video analytics 120 detect first object 720 and second object 730 and recognize them as objects of interest. Video analytics 120 may also recognize objects 720 and 730 as humans and generate corresponding object classification metadata. When video analytics 120 detect first and second objects 720 and 730, video analytics 120 generate bounding boxes 740 and 750 surrounding, respectively, first object 720 and second object 730. Bounding boxes 740 and 750 correspond to location information of first and second objects 720 and 730, therefore the bounding boxes encapsulate the person(s) in the video, and the bounding boxes separate the person from the remainder of the image (or background));
Marman does not expressly teach
determine that the background is not meaningful;
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity;
identify pixels of the plurality of pixels that are associated with the background as background pixels;
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, wherein the dimming factor is a function of distance from the bounding box and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame;
create a dimmed background by applying to the individual background pixels, the respective dimming factor; and
cause the display component to display a converted video frame with the dimmed background.
However, Steinberg teaches
determine that the background is not meaningful (see Steinberg Paragraph [0042], The method may also include determining a relevance or importance, or both, of the foreground region or the background region, or both);
identify pixels of the plurality of pixels that are associated with the background as background pixels (see Steinberg Paragraph [0045], The method includes identifying within a digital image acquisition device one or more groups of pixels that correspond to a background region or a foreground region, or both, within an original digitally-acquired image);
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, and distinguishes a person in the video using a bounding box (as taught in Marman), with a system that analyzes video to determine meaningfulness of a background and identifies the corresponding pixels as background pixels (as taught in Steinberg), the motivation being to provide more accurate subject isolation and context awareness in a video (see Steinberg Paragraphs [0016] and [0107]).
Marman in view of Steinberg does not expressly teach
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity;
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, wherein the dimming factor is a function of distance from the bounding box and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame;
create a dimmed background by applying to the individual background pixels, the respective dimming factor; and
cause the display component to display a converted video frame with the dimmed background.
However, Cai teaches
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity (see Cai Column 10, lines 61 – 64, Dimming circuitry 110 may receive image data for a given display frame. The image data (sometimes referred to as pixel data, pixel brightness data, initial brightness values, etc.) includes a brightness value for each pixel in the display, and Column 12, lines 16 – 17, Each pixel may have an initial brightness value in the image data for a given display frame);
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, wherein the dimming factor is a function of distance from the bounding box and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame (see Cai Abstract, The vignetting effect causes the display to have a light-emitting area that gradually fades to a black, non-light-emitting area. The vignetting effect allows for the size and shape of the light-emitting area to be controlled while still being aesthetically pleasing to a viewer. To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. To avoid artifacts caused by the vignetting mask, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. After upsampling, clamping may be performed to reduce the range of the dimming factors to between 0 and 1. Ultimately, the dimming factors are applied to image data for each frame to implement the vignetting mask in each frame, Column 1, lines 39 – 42, To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. The dimming factors may cause the gradual fade desired for the vignetting effect, Column 1, lines 43 - 47, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. The initial array may include dimming factors for only a subset of pixels in the display, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64));
create a dimmed background by applying to the individual background pixels, the respective dimming factor (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64)); and
cause the display component to display a converted video frame with the dimmed background (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64)).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes video to determine meaningfulness of a background, and identifies the corresponding pixels as background pixels (as taught in Marman in view of Steinberg), with a system that dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Cai), the motivation being to provide a strong focus on a video subject by gradually fading the surrounding screen area to black, and to improve power conservation in the device (see Cai Column 4, lines 38 – 42, Column 7, lines 2 – 5, and Column 1, lines 32 – 38).
Regarding Claim 3, Marman in view of Steinberg and Cai teaches
The system of claim 1, wherein the dimming factor further has a rate of change that is at least 20% higher near a periphery of the video frame than near the bounding box (see Cai Column 5, lines 51 – 65, FIG. 5 is a zoomed in view of the border between the light-emitting area 52 and the non-light-emitting area 54 in display 14 of FIG. 4. As shown, the light-emitting area 52 includes pixels that are capable of emitting light at full brightness (e.g., 100% brightness). The pixels in non-light-emitting area 54 are turned off and therefore have brightness values capped at 0%. The full brightness pixels in light-emitting area 52 may be referred to as having a dimming factor of 1 (or 100%). The zero brightness pixels in non-light-emitting area 54 may be referred to as having a dimming factor of 0 (or 0%). The dimming factor for each pixel may be applied to (e.g., multiplied by) an initial brightness for that pixel to obtain a final brightness value for the pixel, as applied to the bounding boxes of Marman).
Regarding Claim 4, Marman in view of Steinberg and Cai teaches
The system of claim 1, wherein the control circuit is further configured to:
perform object recognition on the background (see Marman Paragraph [0023], Video analytics 120 use the first set of image data to carry out various functions such as, but not limited to, object detection, classification, tracking, indexing, and search. To perform these various functions, video analytics 120 include a number of engines or modules that enable detection, classification, and tracking of objects present in the scene based on analysis of first set of the image data); and
determine that the background is not meaningful (see Steinberg Paragraph [0042], The method may also include determining a relevance or importance, or both, of the foreground region or the background region, or both) by classifying an output from the object recognition or comparing an output from the object recognition to contents in a lookup table (see Marman Paragraph [0028], Video analytics 120 include a temporal object classification module 230 that is operable to classify an object according to its type (e.g., human, vehicle, animal, an object of interest) by considering the object's appearance over time).
Regarding Claim 7, Marman in view of Steinberg and Cai teaches
The system of claim 1, wherein the display component comprises an organic light emitting diode (OLED) display panel (see Cai Column 4, lines 22 – 27, Display 14 may be an organic light-emitting diode display, a liquid crystal display, an electrophoretic display, an electrowetting display, a plasma display, a microelectromechanical systems display, a display having a pixel array formed from crystalline semiconductor light-emitting diode dies (sometimes referred to as microLEDs)).
Regarding Claim 8, Marman in view of Steinberg and Cai teaches
The system of claim 1, wherein the display component comprises a light emitting diode (LED) display panel (see Cai Column 4, lines 22 – 27, Display 14 may be an organic light-emitting diode display, a liquid crystal display, an electrophoretic display, an electrowetting display, a plasma display, a microelectromechanical systems display, a display having a pixel array formed from crystalline semiconductor light-emitting diode dies (sometimes referred to as microLEDs)).
Regarding Claim 18, it is rejected under a rationale similar to that of Claim 1. The system can be found in Marman (Abstract, system).
Regarding Claim 19, it is rejected under a rationale similar to that of Claim 3. The system can be found in Marman (Abstract, system).
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Ozawa et al. (U.S. Pub. No. 2025/0182234, hereinafter “Ozawa”).
Regarding Claim 2, Marman in view of Steinberg and Cai teaches
wherein the respective dimming factor varies non-linearly as a function of distance from the bounding box (see Cai Column 1, lines 43 – 49, To avoid artifacts caused by the vignetting mask, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. The initial array may include dimming factors for only a subset of pixels in the display. Upsampling (e.g., bilinear interpolation) may be performed to obtain dimming factors for the remaining pixels in the display).
Marman in view of Steinberg and Cai does not expressly teach
wherein the bounding box may be one of an n-sided polygon, a curved shape, generated by depth information, or a combination thereof.
However, Ozawa teaches
wherein the bounding box may be one of an n-sided polygon, a curved shape, generated by depth information, or a combination thereof (see Ozawa Paragraph [0036], The bounding shape may be any suitable shape (e.g., a circle, a box, a square, a rectangle, a polygon, an ellipse, or any other suitable shape, or any combination thereof));
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes video to determine meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with implementing an n-sided polygon bounding box (as taught in Ozawa), the motivation being to implement a bounding box that is fitted to the object of interest to accentuate or emphasize the object, thus creating a more accurate bounding box with stronger differentiation (see Ozawa Paragraph [0036]).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Zhao et al. (U.S. Pub. No. 2023/0222669, hereinafter “Zhao”).
Regarding Claim 5, Marman in view of Steinberg and Cai teaches all the limitations of claim 1, but does not expressly teach
The system of claim 1, further comprising:
an artificial intelligence (AI) model that has been trained with training data defining meaningful objects or text;
wherein the control circuit is further configured to:
supply the background to the AI model; and
determine that the background is not meaningful based on an output of the AI model.
However, Zhao teaches
The system of claim 1, further comprising:
an artificial intelligence (AI) model that has been trained with training data defining meaningful objects or text (see Zhao Paragraph [0011], Particular embodiments for training the machine-learning model may comprise detecting, using a boundary detection algorithm, a first boundary of an object of interest in the first segmentation mask, and detecting, using the boundary detection algorithm, a second boundary of the object of interest in a ground truth segmentation mask associated with the first image. The computing system may determine a set of boundary pixel locations corresponding to the first boundary and the second boundary. The computing system may compare the first segmentation mask to the ground truth segmentation mask, wherein differences at the set of boundary pixel locations are weighted more relative to differences at other pixel locations. The system may update the machine-learning model based on the comparison and Paragraph [0019], In the video calling context, the image-segmentation model may be trained to separate humans from their background. For an application designed to detect pets and provide augmented information (e.g., suggestions for caring for dogs vs. cats), the image-segmentation model may be trained to determine if a given pixel depicts a dog, cat, bird, etc. Image segmentation may also be used in a mixed-reality context, where images of real persons or objects are merged with virtual objects with proper occlusion);
wherein the control circuit (see Zhao Paragraph [0049], processor) is further configured to:
supply the background to the AI model (see Zhao Paragraph [0018], An image-segmentation task may be designed to process a given image, which could be an image frame in a video, and determine whether each pixel in the image corresponds to the foreground or background, or to particular objects of interests (e.g., humans, pets, furniture, etc.). In particular embodiments, such determination may take the form of a segmentation mask. A segmentation mask, in particular embodiments, may be implemented as an array (or matrix) of values, where each value is associated with one or more groups of pixels in the input image. For example, an input image with n×m pixels may have a corresponding segmentation mask with n×m values. The value associated with each pixel may indicate a likelihood of that pixel depicting a foreground object(s) or a background object(s)); and
determine that the background is not meaningful based on an output of the AI model (see Zhao Paragraph [0018], Embodiments described herein improve image segmentation techniques built on Artificial Intelligence (AI) or Machine-Learning (ML) models. An image-segmentation task may be designed to process a given image, which could be an image frame in a video, and determine whether each pixel in the image corresponds to the foreground or background, or to particular objects of interests (e.g., humans, pets, furniture, etc.). In particular embodiments, such determination may take the form of a segmentation mask. A segmentation mask, in particular embodiments, may be implemented as an array (or matrix) of values, where each value is associated with one or more groups of pixels in the input image. For example, an input image with n×m pixels may have a corresponding segmentation mask with n×m values. The value associated with each pixel may indicate a likelihood of that pixel depicting a foreground object(s) or a background object(s). For example, each value may be within a numerical range of 0 to 1, where a larger value represents a higher likelihood of the associated pixel belonging to a foreground object, and a lower value represents a lower likelihood of the associate pixel belong to a foreground object (in other words, a lower value may represent a higher likelihood of the associated pixel belonging to a background object)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes video to determine meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with an AI model that identifies meaningful objects and determines that a background is not meaningful (as taught in Zhao), the motivation being to allow a system to automatically and dynamically distinguish a background from a video (see Zhao Paragraph [0003]).
Claims 6 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Dessero et al. (U.S. Pub. No. 2025/0078420, hereinafter “Dessero”).
Regarding Claim 6, Marman in view of Steinberg and Cai teaches all the limitations of claim 1, but does not expressly teach
The system of claim 1, wherein the control circuit is further configured to:
determine that the background is meaningful;
cease applying to the individual background pixels, the respective dimming factor; and
cause the display component to display the video frame with, for the plurality of pixels, the respective original pixel intensity.
However, Dessero teaches
The system of claim 1, wherein the control circuit is further configured to:
determine that the background is meaningful (see Dessero Paragraph [0451], For example, the computer system optionally applies a visual effect to the background (such as dimming) when the virtual environment is displayed as a simulated daytime virtual environment (e.g., a beach or sky during the day, which is optionally relatively bright, includes lighter colors, and/or is tinted more yellow or orange relative to the same environment when it is simulated as a nighttime environment) and forgoes applying the visual effect (or applies a different visual effect, such as less dimming and/or different tinting) when the virtual environment is displayed as a simulated nighttime virtual environment (e.g., a beach or sky at night, which is optionally less bright, includes darker colors, and/or is tinted more blue or gray relative to the same virtual environment when it is simulated as a daytime environment), therefore the dimming effect on a background may be ceased based on time of day, and thus the background has value at “night”);
cease applying to the individual background pixels, the respective dimming factor (see Dessero Paragraph [0451], For example, the computer system optionally applies a visual effect to the background (such as dimming) when the virtual environment is displayed as a simulated daytime virtual environment (e.g., a beach or sky during the day, which is optionally relatively bright, includes lighter colors, and/or is tinted more yellow or orange relative to the same environment when it is simulated as a nighttime environment) and forgoes applying the visual effect (or applies a different visual effect, such as less dimming and/or different tinting) when the virtual environment is displayed as a simulated nighttime virtual environment (e.g., a beach or sky at night, which is optionally less bright, includes darker colors, and/or is tinted more blue or gray relative to the same virtual environment when it is simulated as a daytime environment)); and
cause the display component to display the video frame with, for the plurality of pixels, the respective original pixel intensity (see Dessero Paragraph [0451], For example, the computer system optionally applies a visual effect to the background (such as dimming) when the virtual environment is displayed as a simulated daytime virtual environment (e.g., a beach or sky during the day, which is optionally relatively bright, includes lighter colors, and/or is tinted more yellow or orange relative to the same environment when it is simulated as a nighttime environment) and forgoes applying the visual effect (or applies a different visual effect, such as less dimming and/or different tinting) when the virtual environment is displayed as a simulated nighttime virtual environment (e.g., a beach or sky at night, which is optionally less bright, includes darker colors, and/or is tinted more blue or gray relative to the same virtual environment when it is simulated as a daytime environment), therefore at “night” the display is not changed).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes video to determine meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with ceasing to apply a dimming factor when a background is meaningful (as taught in Dessero), the motivation being to allow a system to be context aware and provide better customization of the visual effect of the background (see Dessero Paragraph [0462]).
Regarding Claim 20, it is rejected under a rationale similar to that of Claim 6. The system can be found in Marman (Abstract, system).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Crook (U.S. Pub. No. 2004/0218035).
Regarding Claim 9, Marman in view of Steinberg and Cai teaches all the limitations of claim 1, but does not expressly teach
The system of claim 1, wherein the mixed media data signal is associated with a video conferencing application.
However, Crook teaches
The system of claim 1, wherein the mixed media data signal is associated with a video conferencing application (see Crook Paragraph [0001], The present invention relates to call set-up techniques, hardware and software interfaces and methods of operating same, for the transmission of mixed-media data across telecommunications networks).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes video to determine meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with a mixed media data signal being associated with a video conference (as taught in Crook), the motivation being to reduce bandwidth demands and to allow the graphic of the video conference to be adapted to obscure unwanted regions (see Crook Paragraph [0080]).
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Cheng et al. (U.S. Pub. No. 2025/0265830, hereinafter “Cheng”).
Regarding Claim 10, Marman in view of Steinberg and Cai teaches all the limitations of claim 1, but does not expressly teach
The system of claim 1, wherein the control circuit is further to:
perform a person detection and segmentation operation; and
delineate the person in the video frame from the background with the bounding box responsive to performing the person detection and segmentation operation.
However, Cheng teaches
The system of claim 1, wherein the control circuit is further to:
perform a person detection and segmentation operation (see Cheng Paragraph [0044], the trained-machine learning model A 214 performs image segmentation. Image segmentation can be a preliminary step in other machine-learning processes. For example, image segmentation can improve ML efficiency where the image is to be provided as input for object detection. Rather than processing the whole image, the detector model can be inputted with a region (e.g., showing a face) selected by a segmentation algorithm, Paragraph [0045], the trained-machine learning model A 214 performs object identification. Object identification is a computer vision task that involves identifying and locating objects in images or videos. Object identification involves identifying the presence of objects in an image and locating them using a bounding box or blob. The output is one or more bounding boxes or delineated blobs, each with a class label corresponding to an object identified, such as face); and
delineate the person in the video frame from the background with the bounding box responsive to performing the person detection and segmentation operation (see Cheng Paragraph [0044], the trained-machine learning model A 214 performs image segmentation. Image segmentation can be a preliminary step in other machine-learning processes. For example, image segmentation can improve ML efficiency where the image is to be provided as input for object detection. Rather than processing the whole image, the detector model can be inputted with a region (e.g., showing a face) selected by a segmentation algorithm, Paragraph [0045], the trained-machine learning model A 214 performs object identification. Object identification is a computer vision task that involves identifying and locating objects in images or videos. Object identification involves identifying the presence of objects in an image and locating them using a bounding box or blob. The output is one or more bounding boxes or delineated blobs, each with a class label corresponding to an object identified, such as face).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes the video to determine the meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with performing person detection and segmentation to delineate a person from a background using a bounding box (as taught in Cheng), the motivation being to avoid processing an entire image by using a region selected by segmentation, thereby reducing inference time (see Cheng Paragraph [0044]).
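For illustration only (not part of the record, and not drawn from the cited references), the segmentation-then-delineation operation mapped above can be sketched in a few lines: given a binary person mask produced by a segmentation step, the bounding box that delineates the person from the background is simply the extent of the nonzero mask pixels. The mask contents and function name below are hypothetical.

```python
import numpy as np

def bounding_box(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the nonzero region of a
    binary segmentation mask -- the box that delineates the detected
    person from the background."""
    rows = np.any(mask, axis=1)   # rows containing any person pixel
    cols = np.any(mask, axis=0)   # columns containing any person pixel
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return int(top), int(left), int(bottom), int(right)

# Hypothetical 6x6 mask with a "person" blob in the middle.
mask = np.zeros((6, 6), dtype=bool)
mask[2:5, 1:4] = True
print(bounding_box(mask))  # -> (2, 1, 4, 3)
```

A downstream detector would then be run only on the cropped box rather than the full frame, which is the efficiency rationale quoted from Cheng.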
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”) in view of Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”) and Parker et al. (U.S. Pub. No. 2014/0366040, hereinafter “Parker”).
Regarding Claim 11, Marman in view of Steinberg and Cai teaches all the limitations of claim 1, but does not expressly teach
The system of claim 1, wherein the control circuit is further to:
detect when an audio or video conferencing application is not running in a foreground mode; and
cause the display component to cease displaying when the audio or video conferencing application is not running in the foreground mode.
However, Parker teaches
The system of claim 1, wherein the control circuit is further to:
detect when an audio or video conferencing application is not running in a foreground mode (see Parker Paragraph [0034], In one embodiment, the application events can be an event that may affect whether the application should enter an application sleep state. In one embodiment, an application event could be a transition from foreground to background (or vice versa), full or partial occlusion of the application window, minimizing of one or more the application window(s), the application starting or ceasing playing audio, detection of a connection or disconnection to a camera or microphone input device (video camera, still camera, other type of camera, and/or microphone) and receiving input (or not receiving input) from them, if the application is processing a user event (or ceases processing a user event), the application entering/expiration of a grace period, holding/releasing of a power assertion, opting in or out of application sleep, a change in display status (e.g., a display going for being on to off (powered off or turned off to save power), or visa versa), or another type of event that would allow, disallow, otherwise affect an application sleep state, or some other application event); and
cause the display component to cease displaying when the audio or video conferencing application is not running in the foreground mode (see Parker Paragraph [0034], an application can determine when the application can enter or modify an application's sleep state and signal the kernel to do so. FIG. 3 is a flowchart of one embodiment of a process 300 to detect an event that causes a modification of an application's sleep state. In one embodiment, process 300 is performed by an application sleep management module such as the application sleep management module 110A-N of FIG. 1 that is described above. In FIG. 3, process 300 begins by monitoring the application for application events at block 302. In one embodiment, the application events can be an event that may affect whether the application should enter an application sleep state. In one embodiment, an application event could be a transition from foreground to background (or vice versa), full or partial occlusion of the application window, minimizing of one or more the application window(s), the application starting or ceasing playing audio, detection of a connection or disconnection to a camera or microphone input device (video camera, still camera, other type of camera, and/or microphone) and receiving input (or not receiving input) from them, if the application is processing a user event (or ceases processing a user event), the application entering/expiration of a grace period, holding/releasing of a power assertion, opting in or out of application sleep, a change in display status (e.g., a display going for being on to off (powered off or turned off to save power), or visa versa), or another type of event that would allow, disallow, otherwise affect an application sleep state, or some other application event).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of a system where a user device receives mixed media data, generates a video from it, distinguishes a person in the video using a bounding box, analyzes the video to determine the meaningfulness of a background, and identifies the corresponding pixels as background pixels, in which the system dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Marman in view of Steinberg and Cai), with detecting when an application is not running in the foreground and subsequently ceasing display (as taught in Parker), the motivation being to decrease unnecessary use of device resources (see Parker Paragraph [0003]).
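For illustration only, the foreground-state behavior mapped above amounts to a simple state transition on application lifecycle events. The event names below are hypothetical placeholders, not terms from Parker or the claims.

```python
def on_app_event(event: str, display_on: bool) -> bool:
    """Return the new display state for a conferencing application given
    a lifecycle event: cease displaying when the application leaves the
    foreground, resume when it returns; other events leave the state
    unchanged."""
    if event == "to_background":
        return False
    if event == "to_foreground":
        return True
    return display_on
```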
Claims 12, 13, 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Becker et al. (U.S. Pub. No. 2024/0403484, hereinafter “Becker”) in view of Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”), Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”) and Ozawa (hereinafter “Ozawa”).
Regarding Claim 12, Becker teaches
A non-transitory computer-readable media (see Becker Paragraph [0082], tangible computer-readable storage medium also can be non-transitory in nature) comprising instructions that are, when executed by processing circuitry, to:
and wherein the metadata associated with the video frame further provides a background blur flag that is asserted when the background is not meaningful (see Becker Paragraph [0041], The processing flow involves the flag being transmitted from, and/or indicated by, the application 202 to the rendering engine 223 and then to the compositing engine 227 where the flag may be included in, e.g., per frame metadata or per frame information. Subsequently, the flag is passed on to the recording process, enabling the implementation of the blurring effect by way of obfuscating one or more frames, or at least a portion of a frame. The privacy flag may indicate whether multiple layers of the frame require obfuscation, such as in implementations where the distinction between the multiple layers is preserved beyond the compositing engine 227. For example, each layer may represent a separate component or element that contributes to the overall visual representation in the computer-generated reality environment. These layers can include the background scenery, in which the area [background scenery] is privacy-sensitive content and thus not meaningful to contribute to a streaming session, and Paragraph [0056], the recording process can query an alternative process (not shown) to obtain metadata information regarding virtual objects in the frame, such as depth, position, privacy-sensitive, and the like. The recording process may independently conduct such queries to gather relevant details about objects present within the frame, enabling it to augment the understanding of the visual content and thereby effectuate, for example, a per object blurring/obfuscation instead of blurring/obfuscating the entire frame);
Becker does not expressly teach
receive a video frame from a video conferencing application on an external device;
receive metadata associated with the video frame, wherein the metadata delineates a person from a background in the video frame using a bounding box, and wherein the bounding box may be one of an n-sided polygon, a curved shape, generated by depth information, or a combination thereof;
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity;
identify pixels of the plurality of pixels that are associated with the background as background pixels;
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, and wherein the dimming factor is a function of distance from the bounding box, and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame;
create a dimmed background by applying to the individual background pixels, the respective dimming factor; and
cause a display component to display a converted video frame with the dimmed background.
However, Marman teaches
receive a video frame from a video conferencing application on an external device (see Marman Paragraph [0038], When an object of interest is present in the scene, video analytics 120 send to display management module 340 the metadata corresponding to location information of the object of interest. Display management module 340 also receives the first set of high resolution image data produced by imager 115. The first set of image data may be sent to display management module 340 from video analytics 120, imager 115, or data storage system 255, Paragraph [0048], display management module 340 and a storage device 390 of data storage system 255 are remote from camera 110, and Paragraph [0020], camera 110 is a megapixel video camera including a high resolution megapixel imager 115 implemented with an advanced pixel architecture for capturing images of a field of view of camera 110);
receive metadata associated with the video frame, wherein the metadata delineates a person from a background in the video frame using a bounding box (see Marman Paragraph [0038], When an object of interest is present in the scene, video analytics 120 send to display management module 340 the metadata corresponding to location information of the object of interest. Display management module 340 also receives the first set of high resolution image data produced by imager 115. The first set of image data may be sent to display management module 340 from video analytics 120, imager 115, or data storage system 255, Paragraph [0048], display management module 340 and a storage device 390 of data storage system 255 are remote from camera 110, Paragraph [0020], camera 110 is a megapixel video camera including a high resolution megapixel imager 115 implemented with an advanced pixel architecture for capturing images of a field of view of camera 110, and Paragraph [0063] and Figure 7, Image 700 contains representations of a first object 720 (a first person) and a second object 730 (a second person) captured in the scene. Video analytics 120 detect first object 720 and second object 730 and recognize them as objects of interest. Video analytics 120 may also recognize objects 720 and 730 as humans and generate corresponding object classification metadata. When video analytics 120 detect first and second objects 720 and 730, video analytics 120 generate bounding boxes 740 and 750 surrounding, respectively, first object 720 and second object 730. Bounding boxes 740 and 750 correspond to location information of first and second objects 720 and 730, therefore the bounding boxes encapsulate the person(s) in the video, and the bounding boxes separate the person from the remainder of the image (or background)),
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of metadata associated with a video frame that provides a blur flag that is asserted when a background is not meaningful (as taught in Becker), with receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box (as taught in Marman), the motivation being to enable a system to focus on important regions to avoid additional processing and reduce bandwidth loads of a network (see Marman Paragraph [0060]).
Becker in view of Marman does not expressly teach
wherein the bounding box may be one of an n-sided polygon, a curved shape, generated by depth information, or a combination thereof;
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity;
identify pixels of the plurality of pixels that are associated with the background as background pixels;
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, and wherein the dimming factor is a function of distance from the bounding box, and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame;
create a dimmed background by applying to the individual background pixels, the respective dimming factor; and
cause a display component to display a converted video frame with the dimmed background.
However, Ozawa teaches
wherein the bounding box may be one of an n-sided polygon, a curved shape, generated by depth information, or a combination thereof (see Ozawa Paragraph [0036], The bounding shape may be any suitable shape (e.g., a circle, a box, a square, a rectangle, a polygon, an ellipse, or any other suitable shape, or any combination thereof));
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of metadata associated with a video frame that provides a blur flag that is asserted when a background is not meaningful (as taught in Becker), with receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box (as taught in Marman), the motivation being to enable a system to focus on important regions to avoid additional processing and reduce bandwidth loads of a network (see Marman Paragraph [0060]).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box and provides a blur flag that is asserted when a background is not meaningful (as taught in Becker in view of Marman), with implementing an n-sided polygon bounding box (as taught in Ozawa), the motivation being to implement a bounding box that is fitted to the object of interest to accentuate or emphasize the object, thus creating a more accurate bounding box with stronger differentiation (see Ozawa Paragraph [0036]).
Becker in view of Marman and Ozawa does not expressly teach
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity;
identify pixels of the plurality of pixels that are associated with the background as background pixels;
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, and wherein the dimming factor is a function of distance from the bounding box, and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame;
create a dimmed background by applying to the individual background pixels, the respective dimming factor; and
cause a display component to display a converted video frame with the dimmed background.
However, Cai teaches
wherein the video frame comprises, for a plurality of pixels, a respective original pixel intensity (see Cai Column 10, lines 61 – 64, Dimming circuitry 110 may receive image data for a given display frame. The image data (sometimes referred to as pixel data, pixel brightness data, initial brightness values, etc.) includes a brightness value for each pixel in the display, and Column 12, lines 16 – 17, Each pixel may have an initial brightness value in the image data for a given display frame);
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, and wherein the dimming factor is a function of distance from the bounding box, and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame (see Cai Abstract, The vignetting effect causes the display to have a light-emitting area that gradually fades to a black, non-light-emitting area. The vignetting effect allows for the size and shape of the light-emitting area to be controlled while still being aesthetically pleasing to a viewer. To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. To avoid artifacts caused by the vignetting mask, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. After upsampling, clamping may be performed to reduce the range of the dimming factors to between 0 and 1. Ultimately, the dimming factors are applied to image data for each frame to implement the vignetting mask in each frame, Column 1, lines 39 – 42, To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. The dimming factors may cause the gradual fade desired for the vignetting effect, and Column 1, lines 43 - 47, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. The initial array may include dimming factors for only a subset of pixels in the display, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64));
create a dimmed background by applying to the individual background pixels, the respective dimming factor (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64)); and
cause a display component to display a converted video frame with the dimmed background (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels, and Figure 4, pixels that have a dimming factor applied to them are outside the central emphasis region or light-emitting area 52, therefore background pixels when the area 52 is the visible area of the display that changes according to the viewer’s gaze (Column 6, lines 62 – 64)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of metadata associated with a video frame that provides a blur flag that is asserted when a background is not meaningful (as taught in Becker), with receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box (as taught in Marman), the motivation being to enable a system to focus on important regions to avoid additional processing and reduce bandwidth loads of a network (see Marman Paragraph [0060]).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box and provides a blur flag that is asserted when a background is not meaningful (as taught in Becker in view of Marman), with implementing an n-sided polygon bounding box (as taught in Ozawa), the motivation being to implement a bounding box that is fitted to the object of interest to accentuate or emphasize the object, thus creating a more accurate bounding box with stronger differentiation (see Ozawa Paragraph [0036]).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using an n-sided polygon bounding box and provides a blur flag that is asserted when a background is not meaningful (as taught in Becker in view of Marman and Ozawa), with a system that dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Cai), the motivation being to provide a strong focus on a video subject by gradually fading the surrounding screen area to black, and to improve power conservation in the device (see Cai Column 4, lines 38 – 42, Column 7, lines 2 – 5, and Column 1, lines 32 – 38).
Becker in view of Marman, Ozawa and Cai does not expressly teach
identify pixels of the plurality of pixels that are associated with the background as background pixels;
However, Steinberg teaches
identify pixels of the plurality of pixels that are associated with the background as background pixels (see Steinberg Paragraph [0045], The method includes identifying within a digital image acquisition device one or more groups of pixels that correspond to a background region or a foreground region, or both, within an original digitally-acquired image);
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of metadata associated with a video frame that provides a blur flag that is asserted when a background is not meaningful (as taught in Becker), with receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box (as taught in Marman), the motivation being to enable a system to focus on important regions to avoid additional processing and reduce bandwidth loads of a network (see Marman Paragraph [0060]).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using a bounding box and provides a blur flag that is asserted when a background is not meaningful (as taught in Becker in view of Marman), with implementing an n-sided polygon bounding box (as taught in Ozawa), the motivation being to implement a bounding box that is fitted to the object of interest to accentuate or emphasize the object, thus creating a more accurate bounding box with stronger differentiation (see Ozawa Paragraph [0036]).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using an n-sided polygon bounding box and provides a blur flag that is asserted when a background is not meaningful (as taught in Becker in view of Marman and Ozawa), with a system that dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Cai), the motivation being to provide a strong focus on a video subject by gradually fading the surrounding screen area to black, and to improve power conservation in the device (see Cai Column 4, lines 38 – 42, Column 7, lines 2 – 5, and Column 1, lines 32 – 38).
It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device and associated metadata that delineates a person from the background using an n-sided polygon bounding box, provides a blur flag that is asserted when a background is not meaningful, and dims background pixels in a video frame by applying a distance-based dimming factor and displays the resulting video with a dimmed background (as taught in Becker in view of Marman, Ozawa and Cai), with a system that analyzes video to identify the corresponding pixels as background pixels (as taught in Steinberg), the motivation being to provide more accurate subject isolation and context awareness in a video (see Steinberg Paragraphs [0016] and [0107]).
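For illustration only, the claimed distance-based dimming — a factor that reduces original pixel intensity, at a minimum adjacent to the bounding box and a maximum at the frame edge — can be sketched as follows. This is a minimal sketch under assumptions not in the record: a 2-D grayscale frame, a linear falloff, a Chebyshev-style distance from the box, and hypothetical `min_dim`/`max_dim` defaults.

```python
import numpy as np

def dim_background(frame: np.ndarray, box: tuple[int, int, int, int],
                   min_dim: float = 0.1, max_dim: float = 0.8) -> np.ndarray:
    """Dim pixels of a 2-D intensity frame that lie outside `box`
    (top, left, bottom, right) by a factor growing linearly with
    distance from the box: near `min_dim` adjacent to the box, `max_dim`
    at the farthest frame edge. Pixels inside the box are unchanged."""
    h, w = frame.shape[:2]
    top, left, bottom, right = box
    ys, xs = np.mgrid[0:h, 0:w]
    # Per-axis distance outside the box (0 for pixels inside the box).
    dy = np.maximum(np.maximum(top - ys, ys - bottom), 0)
    dx = np.maximum(np.maximum(left - xs, xs - right), 0)
    dist = np.maximum(dy, dx).astype(float)
    max_dist = float(dist.max()) or 1.0   # avoid divide-by-zero
    dim = min_dim + (max_dim - min_dim) * dist / max_dist
    # Apply the dimming factor only to background (outside-box) pixels.
    out = frame.astype(float) * (1.0 - np.where(dist > 0, dim, 0.0))
    return out.astype(frame.dtype)
```

A caller would then hand the converted frame to the display component; foreground pixels keep their original intensity while the background fades progressively toward the frame edge.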
Regarding Claim 13, Becker in view of Marman, Ozawa, Cai and Steinberg teaches
The non-transitory computer-readable media of claim 12, wherein the instructions are further to vary the respective dimming factor non-linearly as a function of distance from the bounding box (see Cai Column 1, lines 43 – 49, To avoid artifacts caused by the vignetting mask, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. The initial array may include dimming factors for only a subset of pixels in the display. Upsampling (e.g., bilinear interpolation) may be performed to obtain dimming factors for the remaining pixels in the display).
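For illustration only, one way a dimming factor could vary non-linearly with distance (claim 13) is a power-law ramp in place of a linear one. The function name, defaults, and the choice of a power law are hypothetical, not drawn from Cai or the claims.

```python
def nonlinear_dim(norm_dist: float, min_dim: float = 0.1,
                  max_dim: float = 0.8, gamma: float = 2.0) -> float:
    """Non-linear (power-law) dimming factor: `min_dim` adjacent to the
    bounding box (norm_dist = 0), `max_dim` at the frame edge
    (norm_dist = 1), varying as norm_dist**gamma in between, so the dim
    ramps up slowly near the subject and faster toward the edge."""
    return min_dim + (max_dim - min_dim) * norm_dist ** gamma
```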
Regarding Claim 15, Becker in view of Marman, Ozawa, Cai and Steinberg teaches
The non-transitory computer-readable media of claim 12, wherein the instructions are further to:
detect when the background blur flag is de-asserted (see Becker Paragraph [0042], A safeguard can be set in place to ensure that the system only stops applying the blurring effect when both the application 202 and the keyboard process agree that there is no longer a privacy concern, such as by no longer setting the privacy flag and/or indicator, and Paragraph [0057], The processing continues until the recording process receives frame indicating that blurring/obfuscation no longer needs to be applied, at which time the recording process resumes outputting the un-obfuscated frames); and
cause the display component to display the video frame with, for the plurality of pixels, the respective original pixel intensity (see Becker Paragraph [0042], A safeguard can be set in place to ensure that the system only stops applying the blurring effect when both the application 202 and the keyboard process agree that there is no longer a privacy concern, such as by no longer setting the privacy flag and/or indicator, and Paragraph [0057], The processing continues until the recording process receives frame indicating that blurring/obfuscation no longer needs to be applied, at which time the recording process resumes outputting the un-obfuscated frames).
Regarding Claim 16, Becker in view of Marman, Ozawa, Cai and Steinberg teaches
The non-transitory computer-readable media of claim 12, wherein the instructions are further to:
detect when the background blur flag is de-asserted (see Becker Paragraph [0042], A safeguard can be set in place to ensure that the system only stops applying the blurring effect when both the application 202 and the keyboard process agree that there is no longer a privacy concern, such as by no longer setting the privacy flag and/or indicator);
identify pixels of the plurality of pixels that are associated with the background as background pixels (see Steinberg Paragraph [0045], The method includes identifying within a digital image acquisition device one or more groups of pixels that correspond to a background region or a foreground region, or both, within an original digitally-acquired image);
for individual background pixels, determine a respective dimming factor that reduces the respective original pixel intensity, and wherein the respective dimming factor is a function of distance from the bounding box and ranges from a minimum amount adjacent to the bounding box to a maximum amount at an edge of the video frame (see Cai Abstract, The vignetting effect causes the display to have a light-emitting area that gradually fades to a black, non-light-emitting area. The vignetting effect allows for the size and shape of the light-emitting area to be controlled while still being aesthetically pleasing to a viewer. To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. To avoid artifacts caused by the vignetting mask, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. After upsampling, clamping may be performed to reduce the range of the dimming factors to between 0 and 1. Ultimately, the dimming factors are applied to image data for each frame to implement the vignetting mask in each frame, Column 1, lines 39 – 42, To implement a vignetting mask for the vignetting effect, control circuitry in the electronic device may apply dimming factors to image data for the display. The dimming factors may cause the gradual fade desired for the vignetting effect, and Column 1, lines 43 - 47, an initial array of dimming factors for implementing the vignetting mask may have a range between −0.5 and 1.5. The initial array may include dimming factors for only a subset of pixels in the display);
create a dimmed background by applying to the individual background pixels, the respective dimming factor (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels); and
cause the display component to display a converted video frame with the dimmed background (see Cai Column 12, lines 17 – 27, The dimming factor for that respective pixel may be applied to the initial brightness value for that respective pixel. The dimming factor may be multiplied by the initial brightness value or other processing (e.g., another function) may be used to apply the dimming factor to the initial brightness value. The result is a modified brightness value for the respective pixel. The modified image data (with a modified brightness value for each pixel) is provided to the display (e.g., display driver circuitry) and displayed on the array of pixels).
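For illustration only, the distance-based background dimming recited in the claim limitations above might be sketched as follows. This sketch is not drawn from Marman, Steinberg, or Cai; the function name, the use of Euclidean distance, and the 0.1 to 0.8 dimming range are hypothetical choices made solely to demonstrate a dimming factor that is minimal adjacent to the bounding box and maximal at the frame edge.

```python
import numpy as np

def dim_background(frame, mask, box):
    """Dim background pixels by a factor that grows with distance from the box.

    frame: HxW array of pixel intensities in [0, 1]
    mask:  HxW boolean array, True where a pixel is background
    box:   (x0, y0, x1, y1) bounding box delineating the person
    """
    h, w = frame.shape
    x0, y0, x1, y1 = box
    ys, xs = np.mgrid[0:h, 0:w]
    # Per-pixel distance from the bounding box (zero inside or adjacent to it)
    dx = np.maximum(np.maximum(x0 - xs, xs - x1), 0)
    dy = np.maximum(np.maximum(y0 - ys, ys - y1), 0)
    dist = np.hypot(dx, dy)
    max_dist = dist.max() or 1.0
    # Dimming factor: minimum amount adjacent to the box,
    # maximum amount at the edge of the video frame
    min_dim, max_dim = 0.1, 0.8
    dim = min_dim + (max_dim - min_dim) * (dist / max_dist)
    # Apply the respective dimming factor to each background pixel
    out = frame.copy()
    out[mask] = frame[mask] * (1.0 - dim[mask])
    return out
```

As in Cai, the factor is applied multiplicatively to each pixel's initial brightness value to yield a modified value; any other monotonic mapping from distance to dimming amount would serve equally well for the sketch.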
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Becker et al. (U.S. Pub. No. 20240403484, hereinafter “Becker”) in view of Marman et al. (U.S. Pub. No. 2012/0062732, hereinafter “Marman”), Cai et al. (U.S. Patent 11,741,918, hereinafter “Cai”), Steinberg et al. (U.S. Pub. No. 2016/0065861, hereinafter “Steinberg”) and Parker et al. (U.S. Pub. No. 2014/0366040, hereinafter “Parker”).
Regarding Claim 17, Becker in view of Marman, Ozawa, Cai and Steinberg teaches all the limitations of claim 12, but does not expressly teach
The non-transitory computer-readable media of claim 12, wherein the instructions are further to:
detect when an audio or video conferencing application is not running in a foreground; and
cause the display component to cease displaying when the audio or video conferencing application is not running in the foreground.
However, Parker teaches
The non-transitory computer-readable media of claim 12, wherein the instructions are further to:
detect when an audio or video conferencing application is not running in a foreground (see Parker Paragraph [0034], In one embodiment, the application events can be an event that may affect whether the application should enter an application sleep state. In one embodiment, an application event could be a transition from foreground to background (or vice versa), full or partial occlusion of the application window, minimizing of one or more the application window(s), the application starting or ceasing playing audio, detection of a connection or disconnection to a camera or microphone input device (video camera, still camera, other type of camera, and/or microphone) and receiving input (or not receiving input) from them, if the application is processing a user event (or ceases processing a user event), the application entering/expiration of a grace period, holding/releasing of a power assertion, opting in or out of application sleep, a change in display status (e.g., a display going for being on to off (powered off or turned off to save power), or visa versa), or another type of event that would allow, disallow, otherwise affect an application sleep state, or some other application event); and
cause the display component to cease displaying when the audio or video conferencing application is not running in the foreground (see Parker Paragraph [0034], an application can determine when the application can enter or modify an application's sleep state and signal the kernel to do so. FIG. 3 is a flowchart of one embodiment of a process 300 to detect an event that causes a modification of an application's sleep state. In one embodiment, process 300 is performed by an application sleep management module such as the application sleep management module 110A-N of FIG. 1 that is described above. In FIG. 3, process 300 begins by monitoring the application for application events at block 302. In one embodiment, the application events can be an event that may affect whether the application should enter an application sleep state. In one embodiment, an application event could be a transition from foreground to background (or vice versa), full or partial occlusion of the application window, minimizing of one or more the application window(s), the application starting or ceasing playing audio, detection of a connection or disconnection to a camera or microphone input device (video camera, still camera, other type of camera, and/or microphone) and receiving input (or not receiving input) from them, if the application is processing a user event (or ceases processing a user event), the application entering/expiration of a grace period, holding/releasing of a power assertion, opting in or out of application sleep, a change in display status (e.g., a display going for being on to off (powered off or turned off to save power), or visa versa), or another type of event that would allow, disallow, otherwise affect an application sleep state, or some other application event).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of receiving a video from an external device along with associated metadata that delineates a person from the background using an n-sided polygon-shaped bounding box and provides a blur flag that is asserted when a background is not meaningful, identifying pixels as background pixels, dimming background pixels in a video frame by applying a distance-based dimming factor, and displaying the resulting video with a dimmed background (as taught in Becker in view of Marman, Ozawa, Cai and Steinberg), with detecting when an application is not running in the foreground and subsequently ceasing to display the application (as taught in Parker), the motivation being to decrease unnecessary use of device resources (see Parker Paragraph [0003]).
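For illustration only, the combined behavior addressed in this rejection might be sketched as follows. This sketch is not drawn from Becker or Parker; the class, the application name, and the event-callback shape are hypothetical stand-ins for whatever foreground-state notification mechanism a given operating system provides.

```python
from dataclasses import dataclass

@dataclass
class DisplayController:
    """Tracks whether the converted video frames should be displayed."""
    displaying: bool = True

    def on_app_state_change(self, app_name: str, in_foreground: bool) -> None:
        # Cease displaying when the audio or video conferencing
        # application is no longer running in the foreground;
        # resume when it returns to the foreground.
        if app_name == "conferencing":
            self.displaying = in_foreground
```

This mirrors Parker's teaching that a foreground-to-background transition is an application event that can trigger a change in an application's sleep or display state.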
Allowable Subject Matter
Claim 14 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARISSA A JONES whose telephone number is (703) 756-1677. The examiner can normally be reached Monday through Friday, 6:30 AM - 4:00 PM CT (telework).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CARISSA A JONES/Examiner, Art Unit 2691
/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2691