Last updated: May 29, 2026

Application No. 18/652,898

SYSTEMS AND METHODS FOR VIDEO ENCODING USING IMAGE SEGMENTATION

Final Rejection §103

Filed

May 02, 2024

Priority

Nov 04, 2021 — provisional 63/275,677 +2 more

Examiner

KWAN, MATTHEW K

Art Unit

2482

Tech Center

2400 — Computer Networks

Assignee

Op Solutions LLC

OA Round

2 (Final)

Interview Optional

Examiner has a relatively high allowance rate (70%) and a +34.6% interview lift. A written response may suffice.

Based on 364 resolved cases, 2023–2026

Examiner Intelligence

KWAN, MATTHEW K

Grants 70% — above average

Career Allowance Rate

255 granted / 364 resolved

+12.1% vs TC avg

Strong +35% interview lift

[Bar chart: allowance rate without vs. with an interview]

+34.6%

Interview Lift

resolved cases with interview

Typical timeline

2y 12m

Avg Prosecution

11 currently pending

Career history

385

Total Applications

across all art units

Statute-Specific Performance

§101: 1.0% (-39.0% vs TC avg)

§103: 90.3% (+50.3% vs TC avg)

§102: 4.8% (-35.2% vs TC avg)

§112: 1.5% (-38.5% vs TC avg)

Based on career data from 364 resolved cases

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 1 is objected to because of the following informalities: on the last 2 lines, the first instance of “object of interest” is preceded by “the” while the second instance is preceded by “the”. As best understood by the Examiner based on fig. 5 of the Applicant’s Specification as filed, the first instance should be preceded by “an” and the second instance should be preceded by “the”. In other words, the first instance still refers to “the object of interest” with no preceding “object of interest” in the claim.
A similar issue occurs in the amendment of claim 10. As best understood by the Examiner, “an object of interest” is recited twice in the claim, implying 2 separate objects of interest (i.e. one “object of interest” related to the boundary limitation and another separate “object of interest” related to the partitioning limitation). Based on fig. 5 of the Applicant’s Specification as filed, the object of interest should be the same, meaning the second instance of “object of interest” in the claim should be “the object of interest”.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1 and 3 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hsiang (U.S. 2020/0195924) in view of Ratner et al. (U.S. 2017/0078664), hereinafter Ratner.

Regarding claim 1, Hsiang discloses a method of encoding a video signal comprising: 
receiving a video frame comprising a plurality of pixels (Hsiang [0057]); 
partitioning the video frame into a plurality of coding tree units (CTUs) (Hsiang [0057]); and 
for a CTU in which a boundary is identified, partition the CTU into at least two coding units (CUs) in which at least one CU contains at least a portion of the region of interest and at least one CU does not contain the region of interest (Hsiang [0057]).
Hsiang does not explicitly disclose performing object detection and image segmentation on the video frame to generate object recognition data and at least one segmentation mask identifying object boundaries; overlay the segmentation mask with the plurality of CTUs; and for a CTU in which an object boundary is identified, partition the CTU into at least two coding units (CUs) in which at least one CU contains at least a portion of the object of interest and at least one CU does not contain the object of interest.
However, Ratner teaches performing object detection and image segmentation on the video frame to generate object recognition data and at least one segmentation mask identifying object boundaries (Ratner [0009] and [0028]); 
overlay the segmentation mask with the plurality of CTUs (Ratner [0034]); and 
for a CTU in which an object boundary is identified, partition the CTU into at least two coding units (CUs) in which at least one CU contains at least a portion of the object of interest and at least one CU does not contain the object of interest (Ratner [0035], [0028] and [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Hsiang’s method with the missing limitations as taught by Ratner to improve coding efficiency (Ratner [0020]).
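
For orientation, the limitations at issue in claim 1 (overlaying a segmentation mask on the CTU grid and splitting any CTU crossed by an object boundary) can be sketched as follows. This is a minimal illustration and not the method of the application, Hsiang, or Ratner: the binary mask representation, the function names, the recursive quadtree split (used here as a stand-in for the claimed horizontal, vertical, or geometric partitions), and the assumption that the frame dimensions are multiples of the CTU size are all hypothetical.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: the segmentation mask is a 2D grid, 1 = object of interest, 0 = background.
Mask = List[List[int]]
CU = Tuple[int, int, int, str]  # (x, y, size, state)


def block_state(mask: Mask, x: int, y: int, size: int) -> str:
    """Classify a square block as 'object', 'background', or 'boundary'."""
    values = {mask[r][c] for r in range(y, y + size) for c in range(x, x + size)}
    if values == {1}:
        return "object"
    if values == {0}:
        return "background"
    return "boundary"  # the block contains both object and background pixels


def split_ctu(mask: Mask, x: int, y: int, size: int, min_size: int) -> List[CU]:
    """Quadtree-split a block until each CU is homogeneous or min_size is reached."""
    state = block_state(mask, x, y, size)
    if state != "boundary" or size <= min_size:
        return [(x, y, size, state)]
    half = size // 2
    cus: List[CU] = []
    for dy in (0, half):
        for dx in (0, half):
            cus.extend(split_ctu(mask, x + dx, y + dy, half, min_size))
    return cus


def partition_frame(mask: Mask, ctu_size: int, min_size: int) -> Dict[Tuple[int, int], List[CU]]:
    """Overlay the mask on the CTU grid and partition each CTU independently."""
    height, width = len(mask), len(mask[0])
    return {
        (x, y): split_ctu(mask, x, y, ctu_size, min_size)
        for y in range(0, height, ctu_size)
        for x in range(0, width, ctu_size)
    }


# Hypothetical 8x8 frame whose left half is the object of interest.
example_mask = [[1 if c < 4 else 0 for c in range(8)] for _ in range(8)]
for cu in partition_frame(example_mask, ctu_size=8, min_size=2)[(0, 0)]:
    print(cu)
```

In this toy frame the single CTU straddles the object boundary, so it is split into CUs that lie entirely inside the object and CUs that contain none of it, which is the outcome the final limitation of claim 1 describes.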

Regarding claim 3, Hsiang in view of Ratner teaches the method of claim 1, wherein the partition is selected from the group including a horizontal partition, a vertical partition, and a geometric partition (Hsiang fig. 1 and claim 4).


Claim(s) 2, 9-11 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hsiang in view of Ratner as applied to claim 1 above, and further in view of Haskell et al. (U.S. 2013/0170563), hereinafter Haskell.

Regarding claim 2, Hsiang in view of Ratner teaches the method of claim 1. Hsiang does not explicitly disclose encoding a CU with at least one of a resolution or quantization parameter determined at least in part by whether the CU contains an object of interest.
However, Haskell teaches a method, further comprising encoding a CU with at least one of a resolution or quantization parameter determined at least in part by whether the CU contains an object of interest (Haskell Abstract, [0091], Table 5, [0093] and [0096]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method taught by Hsiang in view of Ratner with the missing limitations as taught by Haskell to enhance both the spatial and temporal resolution of a video object (Haskell Abstract).
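
The claim 2 limitation (a quantization parameter chosen at least in part by whether a CU contains an object of interest) can be illustrated with a short sketch. The base QP, the offset, and the function name are hypothetical values chosen for illustration, and the 0 to 51 clamp assumes an 8-bit H.264/HEVC-style QP range. Nothing here is taken from Haskell or from the application.

```python
def qp_for_cu(contains_object: bool, base_qp: int = 32, roi_offset: int = -6) -> int:
    """Pick a quantization parameter for one CU.

    ASSUMPTION: a lower QP (finer quantization) is used for CUs that contain
    the object of interest; base_qp and roi_offset are illustrative only.
    """
    qp = base_qp + roi_offset if contains_object else base_qp
    # ASSUMPTION: clamp to the 0..51 range used by 8-bit H.264/HEVC.
    return max(0, min(51, qp))


# Hypothetical per-CU flags: True if the CU contains part of the object.
for flag in (True, False):
    print(flag, qp_for_cu(flag))
```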
	
	Regarding claim 9, Hsiang in view of Ratner and Haskell teaches the method of claim 2, further comprising the step of motion estimation where the motion estimation is performed at least in part based on the object recognition data (Ratner [0028] and [0031]-[0032] and Haskell [0070]).
The same motivation for claims 1 and 2 applies to claim 9.

	Regarding claim 10, Hsiang in view of Ratner and Haskell teaches a video encoder, the video encoder receiving video frame data comprising a plurality of pixels (Hsiang [0057]), the video frame being partitioned into a plurality of coding tree units (CTU) (Hsiang [0057]), the encoder comprising: 
an image detection and segmentation processor (Hsiang [0061] and Ratner [0044]), the image detection and segmentation processor receiving the video frame and generating object recognition data and at least one image segmentation mask (Ratner [0009] and [0028]); 
a mask to coding block mapping processor (Hsiang [0061] and Ratner [0044]) mapping the at least one segmentation mask to the CTUs of the video frame (Ratner [0034]) and partitioning at least one CTU into a plurality of coding units (CUs) based on a detected boundary of an object of interest in the CTU (Hsiang [0057] and Ratner [0035]), wherein partitioning results in a first CU containing at least a portion of an object of interest and a second CU that excludes any portion of the object of interest (Hsiang [0057] and Ratner [0035], [0028] and [0009]); and
a video encoding processor (Hsiang [0061], Ratner [0044] and Haskell [0035]), the encoding processor receiving the video frame, the object recognition data and the partitioned CUs and encoding the CUs with at least one of a resolution or quantization parameter determined based at least in part on whether the CU includes an object (Haskell [0091], Table 5, [0093] and [0096]).
	The same motivation for claims 1 and 2 applies to claim 10.

Regarding claim 11, Hsiang in view of Ratner and Haskell teaches the encoder of claim 10, wherein the mask to coding block mapping processor partitions a CTU using a partition selected from the group including a horizontal partition, a vertical partition and a geometric partition (see claim 3).

	Regarding claim 17, Hsiang in view of Ratner and Haskell teaches the encoder of claim 10, further comprising motion estimation processing, wherein motion estimation is performed at least in part based on the object recognition data (see claim 9).
	The same motivation for claim 9 applies to claim 17.

Claim(s) 4-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hsiang in view of Ratner as applied to claim 1 above, and further in view of Vajda et al. (U.S. 2019/0171903), hereinafter Vajda.

Regarding claim 4, Hsiang in view of Ratner teaches the method of claim 1. Hsiang does not explicitly disclose wherein the image segmentation is selected from the group including semantic segmentation, instance segmentation, and panoptic segmentation.
However, Vajda teaches a method, wherein the image segmentation is selected from the group including semantic segmentation, instance segmentation, and panoptic segmentation (Vajda [0022]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method taught by Hsiang in view of Ratner with the missing limitations as taught by Vajda to enable computing devices with limited resources to recognize objects captured in images or videos (Vajda [0005]).

	Regarding claim 5, Hsiang in view of Ratner and Vajda teaches the method of claim 1, wherein the object recognition data includes instance labels for each object detected in the video frame (Vajda [0091]).
	The same motivation for claim 4 applies to claim 5.

	Regarding claim 6, Hsiang in view of Ratner and Vajda teaches the method of claim 1, wherein the object recognition data includes instance labels for each of the pixels in the video frame (Vajda [0068]-[0069]).
The same motivation for claim 4 applies to claim 6.

	Regarding claim 7, Hsiang in view of Ratner and Vajda teaches the method of claim 1, wherein the object recognition data includes object class (Ratner [0039] and Vajda [0092]) and object position in the frame (Vajda [0122] and [0114]).
The same motivation for claims 1 and 4 applies to claim 7.

	Regarding claim 8, Hsiang in view of Ratner and Vajda teaches the method of claim 7, wherein the object recognition data further comprises a bounding box of an object (Vajda [0022]).
The same motivation for claim 4 applies to claim 8.
	
Claim(s) 12-16 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hsiang in view of Ratner and Haskell as applied to claim 10 above, and further in view of Vajda.

Regarding claim 12, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the image detection and segmentation processor applies an image segmentation method selected from the group including semantic segmentation, instance segmentation, and panoptic segmentation (see claim 4).
	The same motivation for claim 4 applies to claim 12.

	Regarding claim 13, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the object recognition data includes instance labels for each object detected in the video frame (see claim 5).
	The same motivation for claim 5 applies to claim 13.

	Regarding claim 14, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the object recognition data includes instance labels for each of the pixels in the video frame (see claim 6).
	The same motivation for claim 6 applies to claim 14.

	Regarding claim 15, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the object recognition data includes object class and object position in the frame (see claim 7).
The same motivation for claim 7 applies to claim 15.

	Regarding claim 16, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the object recognition data further comprises a bounding box of an object (see claim 8).
The same motivation for claim 8 applies to claim 16.

	Regarding claim 18, Hsiang in view of Ratner, Haskell and Vajda teaches the encoder of claim 10, wherein the image detection and segmentation processor includes a neural network (Vajda Abstract).
The same motivation for claim 4 applies to claim 18.

Response to Arguments
Applicant's arguments filed 1/29/26 with regard to the previously presented portions of the claims have been fully considered but they are not persuasive.

On pgs. 5-7 of the Applicant’s Response, the Applicant argues that the cited references do not teach the “object of interest” related limitations of the independent claims.
The Examiner respectfully disagrees. As cited above, Hsiang discloses splitting an in-bounds CU and out-of-bounds CU (i.e. a CU / block including a region of interest and a CU / block not containing a region of interest) (Hsiang [0057]). Ratner teaches block splitting that separates objects such as moving and stationary objects (Ratner [0035] and [0028]). Under the broadest reasonable interpretation of the current claim language defining “an object of interest”, Ratner teaches separating a moving object from a stationary object, where either can be the “object of interest” as currently defined in the claim and the other is then not the object of interest (Ratner [0035], [0028] and [0009], partitioning with different moving or stationary objects in different blocks). Therefore, the combination of Hsiang and Ratner teaches the amended limitations of the independent claims as currently written.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
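
The reply windows described in the preceding paragraph reduce to calendar arithmetic from the mailing date. The sketch below uses the Feb 25, 2026 mailing date shown in the Prosecution Timeline on this page. It is only an illustration: the helper names are hypothetical, clamping to the last day of a shorter month is an assumption, and the code does not handle weekend or federal-holiday rollover, extension fees, or the advisory-action rule, so any real deadline should be confirmed against the rules cited in the action.

```python
import calendar
from datetime import date
from typing import Dict


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of a shorter month (assumption)."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_dates(mailing_date: date) -> Dict[str, date]:
    """Nominal dates for a final action; no weekend or holiday adjustment is applied."""
    return {
        "two_month_mark": add_months(mailing_date, 2),        # first-reply window
        "shortened_period_end": add_months(mailing_date, 3),  # THREE MONTHS
        "statutory_maximum": add_months(mailing_date, 6),     # SIX MONTHS
    }


for label, when in reply_dates(date(2026, 2, 25)).items():
    print(f"{label}: {when.isoformat()}")
```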

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW KWAN whose telephone number is (571)270-7073. The examiner can normally be reached Monday-Friday 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chris Kelley can be reached at (571)272-7331. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MATTHEW K KWAN/Primary Examiner, Art Unit 2482


Prosecution Timeline

May 02, 2024

Application Filed

Jun 03, 2025

Non-Final Rejection mailed — §103

Jan 24, 2026

Response after Non-Final Action

Jan 29, 2026

Response Filed

Feb 25, 2026

Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/924,545

Patent 12641261

METHOD FOR DECODING VIDEO FROM VIDEO BITSTREAM, METHOD FOR ENCODING VIDEO, VIDEO DECODER, AND VIDEO ENCODER

1y 7m to grant Granted May 26, 2026

18/924,776

Patent 12641262

METHOD FOR DECODING VIDEO FROM VIDEO BITSTREAM, METHOD FOR ENCODING VIDEO, VIDEO DECODER, AND VIDEO ENCODER

1y 7m to grant Granted May 26, 2026

17/866,036

Patent 12634431

Context Coding for Transform Skip Mode

3y 10m to grant Granted May 19, 2026

18/965,219

Patent 12634502

IMAGE DECODING DEVICE USING DIFFERENTIAL CODING

1y 5m to grant Granted May 19, 2026

18/965,341

Patent 12634503

IMAGE DECODING DEVICE USING DIFFERENTIAL CODING

1y 5m to grant Granted May 19, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

70%

Grant Probability

99%

With Interview (+34.6%)

2y 12m (~11m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 364 resolved cases by this examiner. Grant probability derived from career allowance rate.
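
As a rough check on the figures above, the career allowance rate follows directly from the counts reported on this page (255 granted out of 364 resolved). How the page derives the with-interview figure is not stated. The sketch below assumes the lift is added in percentage points and capped at 99%, which is a guess that happens to match the displayed number, not a documented formula.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage."""
    return 100.0 * granted / resolved


def with_interview(rate_pct: float, lift_pts: float, cap_pct: float = 99.0) -> float:
    """ASSUMPTION: the lift is additive in percentage points and capped at cap_pct."""
    return min(cap_pct, rate_pct + lift_pts)


rate = allowance_rate(255, 364)
print(f"Allowance rate: {rate:.1f}%")
print(f"With interview (assumed additive, capped): {with_interview(rate, 34.6):.0f}%")
```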