Prosecution Insights
Last updated: April 19, 2026
Application No. 18/114,531

RESTRICTED MULTI-SCALE INFERENCE FOR MACHINE LEARNING

Final Rejection — §103
Filed: Feb 27, 2023
Examiner: BALI, VIKKRAM
Art Unit: 2663
Tech Center: 2600 — Communications
Assignee: Zoox Inc.
OA Round: 2 (Final)

Grant Probability: 82% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 11m
With Interview: 93%

Examiner Intelligence

Career Allow Rate: 82% (above average) — 510 granted / 626 resolved; +19.5% vs TC avg
Interview Lift: +11.3% (moderate lift; resolved cases with interview)
Avg Prosecution: 2y 11m typical timeline; 34 applications currently pending
Total Applications: 660 across all art units
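The headline numbers above are simple ratios over the examiner's career counts. A minimal sketch of the arithmetic (variable names are illustrative; they do not come from any real data source or API):

```python
# Recompute the examiner's headline statistics from the raw counts above.
# Variable names are illustrative assumptions, not fields of a real API.
granted = 510        # applications granted
resolved = 626       # applications resolved (granted plus abandoned)
pending = 34         # applications currently pending

career_allow_rate = granted / resolved      # ~0.815, shown in the panel as 82%
total_applications = resolved + pending     # 626 + 34 = 660

print(f"Career allow rate: {career_allow_rate:.1%}")   # 81.5%
print(f"Total applications: {total_applications}")     # 660
```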

Statute-Specific Performance

§101: 16.7% (-23.3% vs TC avg)
§103: 51.2% (+11.2% vs TC avg)
§102: 7.8% (-32.2% vs TC avg)
§112: 18.9% (-21.1% vs TC avg)

Tech Center averages are estimates • Based on career data from 626 resolved cases

Office Action — §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. All amendments to claims filed on 9/15/2025 have been entered and action follows:

Response to Arguments

Applicant’s arguments with respect to claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 3, 6, 7-10, 13, 15-18 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Tang et al. (US 10,157,331) in view of “A multi-scale multiple instance video description network” by Xu.
With respect to claim 7 (exemplary claim), Tang discloses one or more non-transitory computer-readable media storing instructions that, when executed, cause one or more processors to perform operations comprising: providing, as input to a first machine-learning (ML) model associated [with a first size range], an image; determining, by the first ML model and based at least in part on the image and the first size range, a first region of interest (ROI), (see figure 8, numerals 800 and 801, the image is received and input to a neural network to have output as the bounding box “ROI”); determining that the first ROI is associated with a first size that is outside the first size range, (see figure 3, 301a, 301b and 301c are the ROIs, and figure 7, 705, eliminating bounding boxes “determining … outside the first size range”); suppressing, based at least in part on the first size being outside the first size range, the first ROI; and receiving, as a second output from the first ML model, a second ROI associated with an object or a first indication that a dimension of the object is outside the first size range, (see col. 3, lines 14-19, wherein … paring the matrix by using non-maximum suppression; ranking bounding boxes defined by the pared matrix by composite scores based on associated areas and highest associated class confidence scores “a second ROI associated with an object” …; also col. 8, line 64 to col. 9, line 3, wherein … the processor may eliminate “suppressing” second selected bounding boxes that do not include the center point of the image and do not include a point within the buffer (e.g., within 10 pixels, within 20 pixels, within 5% of a width or height of the image, within 10% of the width or height of the image [this is read as “the ROI size”], or the like) …), as claimed.

However, Tang fails to explicitly disclose providing, as input to a first machine-learning (ML) model associated with a first size range, an image, (emphasis added) as claimed.
Xu teaches providing, as input to a first machine-learning (ML) model associated with a first size range, an image, (emphasis added; see figure 2 for three CNNs with three sizes; the outputs of the three are different sizes) as claimed. It would have been obvious to one of ordinary skill in the art at the effective date of invention to combine the two references, as they are analogous because they solve the similar problem of object tracking using image analysis. The teaching of Xu can be incorporated into the Tang system for preprocessing of the image to attain the bounding box for the object (see Tang figure 8), for suggestion, and modifying Tang yields a multiscale object detection system (see Xu Abstract) for motivation.

With respect to claim 8, the combination of Tang and Xu further discloses wherein receiving the second ROI associated with the object or the first indication comprises receiving the first indication, and wherein receiving the first indication that the dimension of the object is outside the first size range is based at least in part on determining that the first ROI includes all of a plurality of ROIs, (see col. 8, line 64 to col. 9, line 3, wherein … the processor may eliminate “suppressing” second selected bounding boxes that do not include the center point of the image and do not include a point within the buffer (e.g., within 10 pixels, within 20 pixels, within 5% of a width or height of the image, within 10% of the width or height of the image [this is read as “first indication that the dimension of the object is outside the first size range is based at least in part on determining that the first ROI includes all of a plurality of ROIs”], or the like)), as claimed.
With respect to claim 9, the combination of Tang and Xu further discloses wherein the operations further comprise: providing, as input to a second ML model associated with a second size range, the image; determining, by the second ML model, a third ROI; suppressing the third ROI, wherein suppressing the third ROI comprises determining that at least a portion of the third ROI is associated with a second size that is outside the second size range; and receiving, from the second ML model, a fourth ROI associated with the object or a second indication that the dimension of the object is outside the second size range, (see Tang figure 8, numerals 800 and 801, the image is received and input to a neural network to have output as the bounding box “ROI”; see col. 3, lines 14-19, wherein … paring the matrix by using non-maximum suppression; ranking bounding boxes defined by the pared matrix by composite scores based on associated areas and highest associated class confidence scores “a second ROI associated with an object” …; also col. 8, line 64 to col. 9, line 3, wherein … the processor may eliminate “suppressing” second selected bounding boxes that do not include the center point of the image and do not include a point within the buffer (e.g., within 10 pixels, within 20 pixels, within 5% of a width or height of the image, within 10% of the width or height of the image [this is read as “the ROI size”], or the like) …; and see Xu figure 2 for three CNNs with three sizes associated with them for the image input; the outputs of the three are different sizes), as claimed.

With respect to claim 10, the combination of Tang and Xu further discloses wherein a fifth ROI corresponding to the object is received from the first ML model or the second ML model, based at least in part on the dimension of the object in the image, the first size range, and the second size range, (see Tang figure 3, 301c, the ROI), as claimed.
With respect to claim 13, the combination of Tang and Xu further discloses wherein the operations further comprise: selecting the first size range and the second size range based at least in part on a machine learned model, (see Xu, the CNNs, “machine learned model”), as claimed.

With respect to claim 15, the combination of Tang and Xu further discloses wherein the operations further comprise: receiving a batch of images, wherein the batch of images comprises a first predefined number of images that are associated with a first object classification and a second predefined number of images that are associated with a second object classification; and training the first ML model based at least in part on providing the batch of images as input to the first ML model, wherein the first predefined number and the second predefined number are based at least in part on a confidence score associated with the first ML model or a second ML model, (see Xu page 5, training/learning of CNNs, left-hand column, section 3.3, wherein … learning of semantic concepts in the video frame would require labeled data in the form of frames and object labels at each location and scale “a first predefined number of images that are associated with a first object classification and a second predefined number of images that are associated with a second object classification”. Since such data is very difficult to obtain, we resort to a multiple instance learning approach …), as claimed.

Claims 2, 3 and 6 are rejected for the same reasons as set forth in the rejections of claims 7, 9, and 15, because claims 2, 3 and 6 are claiming subject matter of similar scope as claimed in claims 7, 9, and 15 respectively. Claims 16, 18 and 21 are rejected for the same reasons as set forth in the rejections of claims 7, 9, and 15, because claims 16, 18 and 21 are claiming subject matter of similar scope as claimed in claims 7, 9 and 15 respectively.
Claim 17 is rejected for the same reasons as set forth in the rejection of claim 8, because claim 17 is claiming subject matter of similar scope as claimed in claim 8.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Tang et al. (US 10,157,331) in view of “A multi-scale multiple instance video description network” by Xu, as applied to claim 9 above, and further in view of Laddha et al. (US Pub. 2018/0348374).

With respect to claim 12, the combination of Tang and Xu further discloses all the elements as claimed and as rejected in claim 9 above. Furthermore, Tang and Xu disclose generating, based at least in part on at least one of the second ROI or the fourth ROI, object recognition (such as identifying a make and model of a detected vehicle or identifying an architectural style of a detected building…), as claimed.

Laddha teaches generating, based at least in part on at least one of the second ROI or the fourth ROI, a trajectory for controlling motion of an autonomous vehicle, and controlling the motion of the autonomous vehicle based at least in part on the trajectory, (emphasis added; see paragraph 0040, wherein … the motion planning system can determine a cost function for each of one or more candidate motion plans for the autonomous vehicle based at least in part on the current locations and/or predicted future locations of the objects. For example, the cost function can describe a cost (e.g., over time) of adhering to a particular candidate motion plan. For example, the cost described by a cost function can increase when the autonomous vehicle approaches impact with another object and/or deviates from a preferred pathway (e.g., a predetermined travel route).), as claimed. It would have been obvious to one of ordinary skill in the art at the effective date of invention to combine the two references, as they are analogous because they solve the similar problem of object tracking using image analysis.
The teaching of Laddha to achieve a pathway for an autonomous vehicle can be incorporated into the Tang and Xu system for preprocessing of the image to attain object detection (see Tang figure 1, 107), for suggestion, and modifying the system yields an autonomous vehicle trajectory (see Laddha paragraph 0001) for motivation.

Allowable Subject Matter

Claims 4, 5, 11, 14, 19 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIKKRAM BALI whose telephone number is (571)272-7415. The examiner can normally be reached Monday-Friday 7:00AM-3:00PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Gregory Morse, can be reached at 571-272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/VIKKRAM BALI/
Primary Examiner, Art Unit 2663
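The claim language at the center of the §103 dispute describes feeding an image to an ML model tied to a size range, then suppressing any region of interest whose size falls outside that range and flagging out-of-range objects. A minimal sketch of that behavior (all names, the notion of "size", and the thresholds are hypothetical illustrations, not taken from Tang, Xu, or the application):

```python
from dataclasses import dataclass

@dataclass
class ROI:
    """A hypothetical region of interest; fields are illustrative only."""
    x: float
    y: float
    width: float
    height: float

def filter_rois(rois, size_range):
    """Suppress ROIs whose size falls outside the model's associated size range.

    Returns (kept_rois, out_of_range): the boolean stands in for the claimed
    "indication that a dimension of the object is outside the first size range".
    """
    lo, hi = size_range
    kept, suppressed = [], []
    for roi in rois:
        size = max(roi.width, roi.height)  # one plausible notion of "size"
        (kept if lo <= size <= hi else suppressed).append(roi)
    return kept, len(suppressed) > 0

# Example: a model associated with a 20-100 pixel size range.
rois = [ROI(0, 0, 150, 80), ROI(10, 10, 40, 30)]
kept, out_of_range = filter_rois(rois, (20, 100))
print(len(kept), out_of_range)  # 1 True
```

In a multi-scale arrangement like the one claimed, several such models would run over the same image, each with its own size range, so every object lands in exactly one model's kept set.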

Prosecution Timeline

Feb 27, 2023 — Application Filed
Jun 11, 2025 — Non-Final Rejection (§103)
Jul 29, 2025 — Examiner Interview Summary
Jul 29, 2025 — Applicant Interview (Telephonic)
Sep 15, 2025 — Response Filed
Mar 09, 2026 — Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology:

Patent 12602810 — TIRE-SIZE IDENTIFICATION METHOD, TIRE-SIZE IDENTIFICATION SYSTEM AND COMPUTER-READABLE STORAGE MEDIUM — Granted Apr 14, 2026 (2y 5m to grant)
Patent 12586208 — APPARATUS AND METHOD FOR OPERATING A DENTAL APPLIANCE — Granted Mar 24, 2026 (2y 5m to grant)
Patent 12567248 — A CROP SCANNING SYSTEM, PARTS THEREOF, AND ASSOCIATED METHODS — Granted Mar 03, 2026 (2y 5m to grant)
Patent 12561937 — METHOD, COMPUTER PROGRAM, PROFILE IDENTIFICATION DEVICE — Granted Feb 24, 2026 (2y 5m to grant)
Patent 12537917 — ADAPTATION OF THE RADIO CONNECTION BETWEEN A MOBILE DEVICE AND A BASE STATION — Granted Jan 27, 2026 (2y 5m to grant)

Study what changed to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview: 93% (+11.3%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate

Based on 626 resolved cases by this examiner. Grant probability derived from career allow rate.
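The "With Interview" figure appears to be the career allow rate plus the interview lift applied as an additive bump in percentage points. A sketch of that assumed arithmetic (the tool's actual model is not disclosed, so this is an illustration, not its implementation):

```python
# Assumed reconstruction of the projection panel's arithmetic: a simple
# additive interview lift on top of the career allow rate.
base = 510 / 626           # career allow rate, ~81.5%, shown in the panel as 82%
interview_lift = 0.113     # +11.3 percentage points with interview

with_interview = base + interview_lift   # ~0.928, shown in the panel as 93%
print(f"{base:.1%} -> {with_interview:.1%} with interview")
```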
