Prosecution Insights
Last updated: April 19, 2026
Application No. 18/691,374

METHOD FOR SEMANTIC SEGMENTATION OF AN IMAGE, AND DEVICE

Status: Non-Final OA (§101, §102, §103)

Filed: Mar 12, 2024
Examiner: VAUGHN, ALEXANDER JOSEPH
Art Unit: 2675
Tech Center: 2600 — Communications
Assignee: Robert Bosch GmbH
OA Round: 1 (Non-Final)

Grant Probability: 73% (Favorable)
Expected OA Rounds: 1-2
Median Time to Grant: 2y 10m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 73% (11 granted / 15 resolved), +11.3% vs TC avg (above average)
Interview Lift: strong, +28.6% higher allowance rate for resolved cases with an examiner interview than for those without
Typical Timeline: 2y 10m average prosecution; 20 applications currently pending
Career History: 35 total applications across all art units
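For readers who want the arithmetic behind these cards, here is a minimal Python sketch. The 11/15 career counts and the 99% with-interview figure come from the page; treating "interview lift" as the allowance-rate gap between interviewed and non-interviewed resolved cases, and the implied 70.4% no-interview rate, are assumptions for illustration.

```python
# Minimal sketch of the arithmetic behind the examiner cards above.
# The 11/15 career counts and the 99%-with-interview figure are from the
# page; defining "interview lift" as the allowance-rate gap between
# interviewed and non-interviewed resolved cases is an assumption.

granted, resolved = 11, 15
career_allow_rate = granted / resolved               # 0.733 -> shown as 73%

with_interview = 0.99                                # "99% With Interview"
interview_lift = 0.286                               # "+28.6% Interview Lift"
without_interview = with_interview - interview_lift  # implied: 70.4%

print(f"career allow rate: {career_allow_rate:.1%}")
print(f"without interview: {without_interview:.1%}")
print(f"interview lift:    {with_interview - without_interview:+.1%}")
```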

Statute-Specific Performance

Statute   Rate    vs TC avg
§101       6.3%   -33.7%
§102      30.0%   -10.0%
§103      52.5%   +12.5%
§112      11.3%   -28.7%

Black line = Tech Center average estimate. Based on career data from 15 resolved cases.
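The "vs TC avg" deltas are plain differences from a Tech Center baseline; solving each row for that baseline gives 40.0% in every case, which is presumably the single TC-average estimate drawn as the black line. A quick sketch:

```python
# Recomputing the "vs TC avg" deltas shown above. The 40.0% baseline is
# inferred from the table rows (e.g., 6.3% - (-33.7%) = 40.0%), not an
# independently reported figure.
examiner_rates = {"§101": 6.3, "§102": 30.0, "§103": 52.5, "§112": 11.3}
TC_AVG = 40.0

for statute, rate in examiner_rates.items():
    print(f"{statute}: {rate:4.1f}%  ({rate - TC_AVG:+5.1f}% vs TC avg)")
```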

Office Action

Rejections at issue: §101, §102, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Claim Objections

Claims 1, 24, and 27 are objected to because of the following informality: “arragement” should read “arrangement”. Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 15-27 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter, as follows. Claims 15, 24, 25, and 27 are directed to an abstract idea, namely mathematical operations and information processing; they are not integrated into a practical application and lack an inventive concept. Claims 16-21, 23, and 26 are likewise directed to an abstract idea, specifically evaluating features and classifying objects. Claim 22 is rejected as being dependent upon a rejected base claim. Thus, all the listed claims are considered non-statutory subject matter.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 15-16 and 20-27 are rejected under 35 U.S.C. 102(a)(1) and 35 U.S.C. 102(a)(2) as being anticipated by Lee et al. (US 20190042860 A1), hereinafter Lee.

Regarding claim 15, Lee teaches a method for semantic segmentation of an image that has been captured by an environment detection arrangement of an automatedly moving device (Abstract: "Disclosed is a method and apparatus of detecting an object of interest, where the apparatus acquires an input image, sets a region of interest (ROI) in the input image, and detects the object of interest from a restoration image."; para. 38: "the examples described herein have been provided merely to illustrate some of the many possible ways of implementing the methods, apparatuses, and/or systems described herein."; para. 108: "The instructions or software to control computing hardware, for example, one or more processors or computers, to implement the hardware components and perform the methods as described above, and any associated data, data files, and data structures, may be recorded, stored, or fixed in or on one or more non-transitory computer-readable storage media."; para. 46: "an autonomous driving vehicle captures objects such as, for example, a vehicle, a human, a traffic light, a traffic sign, and a lane using various capturing devices when recognizing a road situation."; para. 72: "the detecting apparatus performs the segmentation on the input image by dividing an object in the input image in a semantic unit, analyzing significations of the divided regions of the object in a pixel unit, and labeling the divided regions for each class.").

Lee teaches using a computing system having computing power (para. 88: "FIG. 9 is a diagram illustrating an example of an apparatus for detecting an object of interest. A detecting apparatus 900 includes camera(s) 910, a detector 930, a region of interest (ROI) setter 950, a super resolution image generator 970, and a determiner 990. The detector 930, the ROI setter 950, the super resolution image generator 970, and the determiner 990 may be implemented by one or more processors.").

Lee teaches obtaining the image and selecting a region in the image (para. 51: "an apparatus for detecting an object of interest acquires an input image."; para. 54: "the detecting apparatus sets at least one region of interest (ROI) in the input image. The ROI includes at least one of a portion of a region of the input image or at least one object included in the input image."; para. 55: "the detecting apparatus may set the road vanishing point as an ROI by performing road segmentation based on deep learning or set a box having a low reliability score as an ROI by detecting a bounding box.").

Lee teaches assigning each segment of a plurality of segments including image points of the image one of a plurality of classes within a scope of the semantic segmentation (para. 72: "the detecting apparatus performs the segmentation on the input image by dividing an object in the input image in a semantic unit, analyzing significations of the divided regions of the object in a pixel unit, and labeling the divided regions for each class... a class includes twenty classifications, such as, for example, a lane, a vanishing point, a road, a vehicle, a sidewalk, a human, an animal, a building, and sky. An identifying apparatus may accurately detect positions of a lane, a road vanishing point, and a vehicle from a pixel-based label included in the input image on which the segmentation is performed.").

Lee teaches, based on a ratio of the selected region to the image, using a higher proportion of the computing power for the selected region of the image than for the rest of the image (para. 56: "the detecting apparatus generates a restoration image corresponding to the ROI. In this example, the restoration image has a resolution greater than or equal to a resolution of the input image. In an example, the restoration image is acquired by restoring, in high-resolution or super-resolution, a portion corresponding to the ROI in the input image."; para. 87: "The restoration image, i.e., a super resolution image is generated by applying a super resolution technology only to an ROI rather than the entire input image such that a detection speed and a detection accuracy of an object of interest are enhanced and a load of a processor is reduced."). (Examiner note: The region is always smaller than the overall image, so the ratio is less than 1. Only the regions of interest are processed as up-sampled images, which uses a higher proportion of computing power, to be more efficient with computing resources.)
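To make the limitation mapped above concrete, here is a minimal NumPy sketch of allocating a higher proportion of computing power to the selected region than to the rest of the image, in the spirit of Lee's ROI-only super resolution. Everything in it (`segment`, `segment_with_roi`, the toy thresholding classifier) is a hypothetical illustration, not Lee's or the applicant's actual method.

```python
import numpy as np

# Illustrative sketch: spend more compute on the selected region than on
# the rest of the image. "More compute" is modeled as segmenting the ROI
# at full resolution while the rest of the image is processed at a coarser
# stride. `segment` is a hypothetical stand-in for any per-pixel classifier.

def segment(patch: np.ndarray, stride: int) -> np.ndarray:
    """Toy per-pixel classifier: label every `stride`-th pixel, then
    repeat labels to fill the patch (coarser stride = less compute)."""
    coarse = (patch[::stride, ::stride].mean(axis=-1) > 0.5).astype(np.int64)
    return coarse.repeat(stride, axis=0).repeat(stride, axis=1)[
        : patch.shape[0], : patch.shape[1]
    ]

def segment_with_roi(image: np.ndarray, roi: tuple[slice, slice]) -> np.ndarray:
    """Fine-stride segmentation inside the ROI, coarse elsewhere."""
    labels = segment(image, stride=4)            # cheap pass on the whole image
    labels[roi] = segment(image[roi], stride=1)  # full compute on the ROI only
    return labels

img = np.random.rand(480, 640, 3)
roi = (slice(200, 320), slice(260, 420))  # e.g. around a road vanishing point
out = segment_with_roi(img, roi)
```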
Finally, Lee teaches generating and outputting a classified resulting image (para. 61: "the detecting apparatus projects the detected object of interest, the restoration image, and/or the input image to a front glass or a separate screen of the vehicle using a head-up display (HUD)."; para. 104: "The display 1090 may display the input image and the restoration image including the object of interest").

Regarding claim 16, Lee teaches the method according to claim 15, wherein the segments of the image are each assigned one of a plurality of classes by features being determined for each of the segments of the image, and wherein each class is assigned based on the features (para. 58: "The detecting apparatus may detect the object of interest from the restoration image using, for example, a convolution neural network (CNN), a deep neural network (DNN), and a support vector machine that are pre-trained to recognize objects of interest such as a road marking and a vehicle."; para. 72: "the detecting apparatus performs the segmentation on the input image by dividing an object in the input image in a semantic unit, analyzing significations of the divided regions of the object in a pixel unit, and labeling the divided regions for each class... a class includes twenty classifications, such as, for example, a lane, a vanishing point, a road, a vehicle, a sidewalk, a human, an animal, a building, and sky.").

Regarding claim 20, Lee teaches the method according to claim 15, wherein the region in the image is selected based on a position of the environment detection arrangement in the device, with respect to a plane on which the device moves (para. 23: "detect a road vanishing point where a lane marking converges based on a segmentation of the lane marking in the input image, and to set a region corresponding to the road vanishing point as the ROI."; para. 47: "an object located far away from the autonomous driving vehicle... an image captured by a right camera and pixels of an image captured by a left camera."; para. 89: "The camera 910 captures an external image of a front view of a vehicle."; see Fig. 7). (Examiner note: A region may be selected as the forward-view road vanishing point, which is attached to the autonomous vehicle. This inherently includes the position of the camera with respect to the plane on which the vehicle moves.)

Regarding claim 21, Lee teaches the method according to claim 15, wherein the region in the image is selected based on a current position of the device within an environment (para. 23, para. 47, para. 89, and Fig. 7, as cited for claim 20). (Examiner note: A region may be selected as the forward-view road vanishing point, which is attached to the autonomous vehicle. This inherently includes the position of the camera within the environment.)

Regarding claim 22, Lee teaches the method according to claim 15, wherein the classified resulting image is used to control the device (para. 62: "The autonomous driving vehicle may easily search for a route and control vertical and horizontal directions by identifying a distant traffic sign, a distant pedestrian, and a distant traffic light using the restoration image including the object of interest received from the detecting apparatus.").

Regarding claim 23, Lee teaches the method according to claim 15, wherein the device is a robot, or a robotic mower, or a domestic robot, or a robot vacuum cleaner, or a wiping robot, or a floor cleaning device, or a road cleaning device, or an at least partly automated vehicle, or a drone (para. 42: "The vehicle described herein refers to any mode of transportation, delivery, or communication such as, for example, an automobile, a truck, a tractor, a scooter, a motorcycle, a cycle, an amphibious vehicle, a snowmobile, a boat, a public transit vehicle, a bus, a monorail, a train, a tram, an autonomous or automated driving vehicle, an intelligent vehicle, a self-driving vehicle, an unmanned aerial vehicle, an electric vehicle (EV), a hybrid vehicle, or a drone.").

Claims 24, 25, and 27 are rejected under the same analysis as claim 15 above; claim 26 is rejected under the same analysis as claim 23 above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Lee in view of Zhao et al., "ICNet for Real-Time Semantic Segmentation on High-Resolution Images," arXiv.org (Cornell University Library), submitted 27 Apr 2017, retrieved on 2-26-2026 from <https://arxiv.org/abs/1704.08545v2>, hereinafter Zhao.

Regarding claim 17, Lee teaches the method according to claim 16. Lee does not teach wherein: (i) determining the features for the selected region and the features for the rest of the image is performed using artificial intelligence-based pattern recognition methods including artificial neural networks, which are different from one another and/or have a different depth and/or have a different number of layers, and/or (ii) determining the features for the selected region is performed using an additional, artificial intelligence-based pattern recognition method including an artificial neural network, with respect to the rest of the image.
However, Zhao teaches these limitations (pg. 5, para. 2: "Our proposed system image cascade network (ICNet) does not simply choose either way. Instead it takes cascade image inputs (i.e., low-, medium- and high-resolution images), adopts cascade feature fusion unit (Sec. 3.3) and is trained with cascade label guidance (Sec. 3.4). The new architecture is illustrated in Fig. 2. The input image with full resolution (e.g., 1024 × 2048 in Cityscapes [7]) is downsampled by factors of 2 and 4, forming cascade input to medium- and high-resolution branches."; pg. 5, para. 3: "Light weighted CNNs (green dotted box) are adopted in higher resolution branches."; Fig. 2 shows multiple CNNs or branches). It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Lee with the teachings of Zhao, implementing additional neural networks that process high-resolution and low-resolution areas separately. Doing so would predictably increase the speed and computational efficiency of the method by balancing high processing for regions of the image that require high precision against lower processing for the rest of the image.

Regarding claim 18, Lee in view of Zhao teaches the method according to claim 17. Lee does not teach wherein, before determining the features, only the rest of the image is scaled down with regard to dimensions to be considered. However, Zhao teaches this limitation (pg. 5, para. 2: "The input image with full resolution (e.g., 1024 × 2048 in Cityscapes [7]) is downsampled by factors of 2 and 4, forming cascade input to medium- and high-resolution branches."; pg. 5, para. 3: "we get semantic extraction using low-resolution input as shown in top branch of Fig. 2. A 1/4 sized image is fed into PSPNet with downsampling rate 8, resulting in a 1/32-resolution feature map."; Fig. 2 shows downsampling before feature extraction). It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Lee and Zhao to scale down the rest of the image before determining features. Doing so would predictably increase the speed and computational efficiency of the method by determining features from a smaller amount of data (a smaller image).

Regarding claim 19, Lee in view of Zhao teaches the method according to claim 18. Lee does not teach wherein the rest of the image is scaled up again after determining the features and before assigning the classes. However, Zhao teaches this limitation (pg. 6, sec. 3.3, para. 1: "We first apply upsampling rate 2 on F1 through bilinear interpolation, yielding the same spatial size as F2. Then a dilated convolution layer with kernel size C3 × 3 × 3 and dilation 2 is applied to refine the upsampled features."). It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Lee and Zhao to scale up the rest of the image after determining features and before assigning classes. Doing so would predictably increase the accuracy of the method by matching the spatial dimensions of the rest of the image with the high-resolution region-of-interest image.
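To make the Zhao combination concrete, here is a minimal NumPy sketch of the ICNet-style pattern cited for claims 17-19: the rest of the image is downsampled before feature extraction (claim 18), features come from branches of different depth (claim 17), and the coarse result is upsampled again before classes are assigned (claim 19). The helper functions are hypothetical toys, not ICNet itself (which uses PSPNet branches, bilinear interpolation, and a cascade feature fusion unit).

```python
import numpy as np

def conv_like(x: np.ndarray, depth: int) -> np.ndarray:
    """Stand-in for a CNN branch: `depth` smoothing passes over the image."""
    for _ in range(depth):
        x = (x + np.roll(x, 1, axis=0) + np.roll(x, 1, axis=1)) / 3.0
    return x

def downsample(x: np.ndarray, factor: int) -> np.ndarray:
    return x[::factor, ::factor]

def upsample(x: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling (ICNet uses bilinear interpolation)."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

image = np.random.rand(256, 512)
roi = (slice(96, 160), slice(192, 320))

# Deep branch on the downsampled "rest of the image" (claims 17-18).
low_res_features = conv_like(downsample(image, 4), depth=8)
# Lighter branch at full resolution for the selected region (claim 17).
roi_features = conv_like(image[roi], depth=2)

# Upsample the coarse features back to full size before assigning
# classes (claim 19), then fuse in the ROI features.
features = upsample(low_res_features, 4)
features[roi] = roi_features
classes = (features > 0.5).astype(np.int64)  # toy 2-class assignment
```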
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Balasubramanian (US 20190102640 A1) discloses detecting objects in an image with neural networks and accelerating CNN computation throughput.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER J VAUGHN, whose telephone number is (571) 272-5253. The examiner can normally be reached M-F 8:30-5. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, ANDREW MOYER, can be reached at (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/ALEXANDER JOSEPH VAUGHN/
Examiner, Art Unit 2675

/EDWARD PARK/
Primary Examiner, Art Unit 2675

Prosecution Timeline

Mar 12, 2024
Application Filed
Mar 04, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by the same examiner involving similar technology

Patent 12591955
SYSTEMS AND METHODS FOR GENERATING DYNAMIC DARK CURRENT IMAGES
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12579756
GRAPHICAL ASSISTANCE WITH TASKS USING AN AR WEARABLE DEVICE
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12573010
IMAGE PROCESSING APPARATUS, RADIATION IMAGING SYSTEM, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12567265
VEHICLE, CONTROL METHOD THEREOF AND CAMERA MONITORING APPARATUS
Granted Mar 03, 2026 (2y 5m to grant)
Patent 12521061
Method of Determining the Effectiveness of a Treatment on a Face
Granted Jan 13, 2026 (2y 5m to grant)
Study what changed in each case to get past this examiner. Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 73%
With Interview: 99% (+28.6% lift)
Median Time to Grant: 2y 10m
PTA Risk: Low

Based on 15 resolved cases by this examiner; grant probability is derived from the career allow rate.
