Prosecution Insights
Last updated: April 19, 2026
Application No. 18/755,269

MACHINE LEARNING MODELS FOR ADAPTIVE POST-PROCESSING USING RESULTS OF SEGMENTATION IN CONFERENCING TOOLS

Non-Final OA: §101, §103
Filed: Jun 26, 2024
Examiner: TIEU, BINH KIEN
Art Unit: 2694
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
OA Rounds: 1-2
To Grant: 2y 5m
With Interview: 97%

Examiner Intelligence

Grants 87% — above average

Career Allow Rate: 87% (809 granted / 931 resolved; +24.9% vs TC avg)
Interview Lift: +9.8% (moderate, roughly +10%; measured across resolved cases with an interview)
Avg Prosecution: 2y 5m (typical timeline; 25 applications currently pending)
Total Applications: 956 (career history, across all art units)
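The headline figures above are plain ratios over the examiner's resolved cases. A minimal sketch of the arithmetic, assuming hypothetical per-case records with `granted` and `had_interview` flags (the field names are illustrative, not from any real data feed):

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool        # application allowed (True) or abandoned/finally rejected (False)
    had_interview: bool  # at least one examiner interview on the record

def allow_rate(cases: list[ResolvedCase]) -> float:
    """Share of resolved cases that were granted (here, 809 / 931 = 0.869, shown as 87%)."""
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases: list[ResolvedCase]) -> float:
    """Allow-rate gap between cases with and without an interview (+9.8 points here)."""
    with_iv = [c for c in cases if c.had_interview]
    without = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without)

# Sanity check against the card: 809 / 931 = 0.869, displayed as 87%; adding the
# +9.8-point interview lift gives 96.8%, displayed as the 97% "with interview" figure.
```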

Statute-Specific Performance

§101: 6.1% (-33.9% vs TC avg)
§103: 43.9% (+3.9% vs TC avg)
§102: 26.5% (-13.5% vs TC avg)
§112: 4.1% (-35.9% vs TC avg)

Tech Center average is an estimate. Based on career data from 931 resolved cases.
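Each figure reduces to a per-statute rejection frequency over the same 931 resolved cases, minus a Tech Center baseline. A minimal sketch of how the displayed numbers reconcile, using only the figures shown above (the flat 40.0% baseline is what every displayed delta implies, not a published USPTO figure):

```python
# Rejection rate per statute for this examiner, from the card above (percent).
examiner_rate = {"§101": 6.1, "§103": 43.9, "§102": 26.5, "§112": 4.1}

# Every displayed delta (rate minus delta) works out to the same 40.0%,
# so the tool appears to apply one flat TC 2600 baseline estimate.
TC_BASELINE = 40.0

for statute, rate in examiner_rate.items():
    delta = rate - TC_BASELINE
    print(f"{statute}: {rate:.1f}% ({delta:+.1f}% vs TC avg)")
# Output matches the card, e.g. "§101: 6.1% (-33.9% vs TC avg)"
```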

Office Action

Rejections under §101 and §103
DETAILED ACTION

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 20 is rejected under 35 U.S.C. 101 as not falling within one of the four statutory categories of invention. Regarding claim 20, the claimed invention is directed to non-statutory subject matter. Claim 20 does not fall within at least one of the four categories of patent-eligible subject matter because independent claim 19 recites in its preamble "One or more computer-readable medium having stored therein computer-executable instructions…" According to the specification, paragraph [0186], "…The term "non-transitory computer-readable media" specifically excludes transitory propagating signals, carrier waves, and wave forms or other intangible or transitory media that may nevertheless be readable by a computer…" Thus, the term "one or more computer-readable medium" can be interpreted as either "non-transitory computer-readable media" or "transitory computer-readable media". Claim 20 therefore reads on a transitory computer-readable medium (i.e., communication media) storing computer-readable instructions or the like; a transitory computer-readable medium is computer software rather than a physical article or object, and as such is not a machine or manufacture within the four statutory categories. See Diamond v. Diehr, 450 U.S. 175, 184 (1981); Parker v. Flook, 437 U.S. 584, 588 n.9 (1978); Gottschalk v. Benson, 409 U.S. 63, 70 (1972); Cochrane v. Deener, 94 U.S. 780, 787-88 (1876).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-2, 4-5, 8-10, 13, 16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi et al. (US 2015/0172726) in view of Yang et al. (US 2023/0044969).
Regarding claim 1, Faramarzi et al. (hereinafter "Faramarzi") teaches a client computing device (i.e., UE 116, as shown in figure 3) comprising a processor system (i.e., main processor 340) and memory (i.e., memory 360; para. [0034]-[0035]), wherein the client computing device implements a conferencing tool (i.e., system 600a or 600b, as shown in figures 6A and 6B) configured to perform operations. Faramarzi further teaches the system of figure 6A comprising a video-encoding system 605a with an encoder 625a and a video-decoding system 650a with a decoder 655a. The encoder 625a encodes video to generate an output (encoded data) comprising a bitstream that includes the encoded video along with metadata (para. [0057]). The decoder 655a receives the output from the encoder 625a of the video-encoding system 605a, decodes it to generate a decoded video, and a super-resolved video is then generated by the super-resolution processor 660a. The post-processing block 665a then performs post-processing on the super-resolved video to generate a further enhanced video as the output of the video-decoding system 650a (para. [0058]).

It should be noted that Faramarzi does not clearly teach obtaining segmentation information for the decoded video for the current unit of the video sequence, the segmentation information indicating one or more foreground segments of the decoded video for the current unit of the video sequence; and applying a trained post-processing model to the one or more foreground segments of the decoded video for the current unit of the video sequence but not to one or more other segments of the decoded video for the current unit of the video sequence.

However, Yang et al. (hereinafter "Yang") teaches a system 100, as shown in figure 1, comprising client devices 104 and a cloud network 102 that comprises a plurality of computing nodes 118 hosting a video service 112 (para. [0035]). Yang further teaches the video service 112 comprising a video matting model (e.g., the Improved Video Matting (IVM) model; para. [0043]). The IVM model 110 utilizes a framework, such as the framework shown in figure 2, comprising an encoder 204 and a decoder 206 (para. [0044]-[0045]). The encoder 204 extracts a feature representation of each input frame (reading on encoded data) and provides the extracted feature representations to the decoder 206 (para. [0048]). Yang further teaches obtaining segmentation information for the decoded video for the current unit of the video sequence, the segmentation information indicating one or more foreground segments (i.e., a representation of a foreground object is generated and obtained; para. [0069] and [0075]); and applying a trained post-processing model (the IVM model 110) to the one or more foreground segments but not to one or more other segments (i.e., the IVM model performs a process to extract a foreground object, such as a human being, from any video frame; para. [0045], [0051], [0070]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the features of obtaining segmentation information indicating one or more foreground segments of the decoded video for the current unit of the video sequence, and applying a trained post-processing model to the one or more foreground segments but not to one or more other segments, as taught by Yang, into the system of Faramarzi in order to provide and display the enhanced video.

Regarding claim 2, Yang further teaches the limitations of the claim, such as generating the representation of a foreground object (as the segmentation information) included in the particular frame, and determining and extracting a foreground object based on that generated representation (para. [0069]-[0070]). Regarding claim 4, Yang further teaches the limitations of the claim, such as the extracted feature representation (as segmentation information) of each input frame (para. [0048]); temporal information (as metadata) in each frame or bitstream is used to determine the extracted feature representation (para. [0049]). Regarding claim 5, Yang further teaches the limitations of the claim, such as the IVM model 110 being utilized and/or performed at the video service 112 in a cloud network 102, as shown in figure 1 (para. [0043]). Yang further teaches the framework 200 of figure 2, which may be utilized by a video matting model such as the IVM model 110; thus, the segmentation information is obtained from a server computing device, such as one implementing the framework 200 (para. [0046] and [0049]). Regarding claim 8, Yang further teaches the limitations of the claim, such as upsampler 306, as shown in figure 6, for upsampling input frames (para. [0062]). Regarding claim 9, Yang further teaches the limitations of the claim in paragraphs [0030], [0053] and [0056]-[0058]. Regarding claim 10, Yang further teaches the limitations of the claim in paragraph [0047]. Regarding claim 13, Yang further teaches the limitations of the claim, such as the IVM model performing or repeating the process to extract a foreground object from any video frame (para. [0070]). Regarding claim 16, Yang further teaches the limitations of the claim in paragraph [0036].

Regarding claim 19, Faramarzi teaches a client computing device (i.e., UE 116, as shown in figure 3) that implements a conferencing tool (i.e., system 600a or 600b, as shown in figures 6A and 6B).
Claim 19 otherwise recites substantially the same encoding, decoding, segmentation, and selective post-processing operations as claim 1 and is rejected over Faramarzi in view of Yang for the reasons, and with the rationale for combining, set forth above for claim 1.

Regarding claim 20, Faramarzi teaches one or more computer-readable media (i.e., memory 360; para. [0035]) having stored therein computer-executable instructions for causing a processor system (i.e., main processor 340), when programmed thereby, to perform operations. Claim 20 otherwise recites substantially the same operations as claim 1 and is likewise rejected over Faramarzi in view of Yang for the reasons, and with the rationale for combining, set forth above for claim 1.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi et al. (US 2015/0172726) in view of Yang et al. (US 2023/0044969) as applied to claims 1 and 2 above, and further in view of Wang et al. (US 2025/0378619). Regarding claim 3, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the feature of determining the segmentation information using a machine learning model having a convolutional U-net architecture. However, Wang et al. (hereinafter "Wang") teaches a DM 208, as shown in figure 1, representing a U-net machine learning model used for image segmentation tasks (para. [0076]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the feature of determining the segmentation information using a machine learning model having a convolutional U-net architecture, as taught by Wang, into the combination of Faramarzi and Yang in order to use the model for video segmentation tasks.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi in view of Yang as applied to claim 1 above, and further in view of Li et al. (US 2025/0336039). Regarding claim 11, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the feature of the trained post-processing model being a super-resolution/video restoration model configured to increase spatial resolution, mitigate compression artifacts, and mitigate upsampling artifacts. However, Li et al. (hereinafter "Li") teaches a super-resolution/video restoration model in paragraphs [0014] and [0024]. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate that feature, as taught by Li, into the combination of Faramarzi and Yang in order to perform super-resolution upscaling of an input video.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi in view of Yang as applied to claim 1 above, and further in view of Lee (US 2012/0169828). Regarding claim 17, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the features of performing video quality analysis and, based at least in part on results of the video quality analysis, adjusting one or more target characteristics of video. However, Lee teaches these features in paragraph [0079] for the purpose of video quality enhancement. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate these features, as taught by Lee, into the combination of Faramarzi and Yang in order to improve video quality.

Allowable Subject Matter

Claims 6-7, 12, 14-15 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BINH TIEU, whose telephone number is (571) 272-7510. The examiner can normally be reached 9-5. The examiner's fax number is (571) 273-7510 and e-mail address is BINH.TIEU@USPTO.GOV.

Examiner interviews are available via telephone or video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, FAN S. TSANG, can be reached at (571) 272-7547.

Any response to this action should be mailed or hand-delivered to: Commissioner of Patents and Trademarks, 401 Dulany Street, Alexandria, VA 22314, or faxed to (571) 273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR; status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. If you have any questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Binh Kien Tieu/
Primary Examiner, Art Unit 2694
Date: February 2026
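The limitation driving the §103 combination is applying a trained post-processing model to the foreground segments of a decoded frame while leaving the other segments untouched. A minimal sketch of that behavior, where `selective_post_process`, `enhance`, and the toy mask are hypothetical stand-ins rather than anything from Faramarzi, Yang, or the application itself:

```python
import numpy as np

def selective_post_process(decoded_frame: np.ndarray,
                           foreground_mask: np.ndarray,
                           enhance) -> np.ndarray:
    """Apply a trained post-processing model only where the segmentation
    mask marks foreground; all other segments pass through unchanged.

    decoded_frame:   H x W x 3 decoded frame for the current unit of the sequence
    foreground_mask: H x W boolean mask (the "segmentation information")
    enhance:         trained model mapping a frame to an enhanced frame
    """
    enhanced = enhance(decoded_frame)      # e.g., super-resolution / restoration
    mask3 = foreground_mask[..., None]     # broadcast the mask over color channels
    return np.where(mask3, enhanced, decoded_frame)

# Usage sketch. The mask could come from a matting model like Yang's IVM or from a
# convolutional U-net (cf. the claim 3 discussion); a toy rectangle stands in here.
frame = np.random.rand(720, 1280, 3).astype(np.float32)
mask = np.zeros((720, 1280), dtype=bool)
mask[200:520, 400:880] = True              # pretend this region is the speaker
out = selective_post_process(frame, mask, enhance=lambda f: np.clip(f * 1.1, 0.0, 1.0))

# A real conferencing decoder would more likely run the model on foreground crops
# only, to save compute; the full-frame-then-mask blend above just makes the claimed
# input/output behavior concrete.
```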

Prosecution Timeline

Jun 26, 2024
Application Filed
Mar 13, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603111
AUDIO GUESTBOOK SYSTEMS AND METHODS
2y 5m to grant • Granted Apr 14, 2026
Patent 12598223
Dynamic Teleconference Content Item Distribution to Multiple Devices Associated with a User
2y 5m to grant • Granted Apr 07, 2026
Patent 12592994
REAL-TIME USER SCREENING OF MESSAGES WITHIN A COMMUNICATION PLATFORM
2y 5m to grant • Granted Mar 31, 2026
Patent 12592740
WIRELESS COMMUNICATION DEVICE AND WIRELESS COMMUNICATION METHOD
2y 5m to grant • Granted Mar 31, 2026
Patent 12573198
COMMUNICATION SYSTEM, OUTPUT DEVICE, COMMUNICATION METHOD, OUTPUT METHOD, AND OUTPUT PROGRAM
2y 5m to grant • Granted Mar 10, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 97% (+9.8%)
Median Time to Grant: 2y 5m
PTA Risk: Low

Based on 931 resolved cases by this examiner. Grant probability derived from career allow rate.
