Last updated: May 29, 2026

Application No. 18/403,333

METHOD AND SYSTEM FOR EFFICIENT OBJECT DENSITY ESTIMATION USING DYNAMIC INPUT RESOLUTION

Non-Final OA §103

Filed

Jan 03, 2024

Priority

Jan 11, 2023 — provisional 63/479,422

Examiner

KRAYNAK, JACK PETER

Art Unit

2668

Tech Center

2600 — Communications

Assignee

Johnson Controls Tyco Ip Holdings LLP

OA Round

1 (Non-Final)

Interview Optional

— +19.3% interview lift. Examiner has a relatively high allowance rate (79%); +19.3% interview lift. A written response may suffice.

Based on 103 resolved cases, 2023–2026

Examiner Intelligence

KRAYNAK, JACK PETER View full profile →

Grants 79% — above average

Career Allowance Rate

81 granted / 103 resolved

+16.6% vs TC avg

Strong +19% interview lift

Without

With

+19.3%

Interview Lift

resolved cases with interview

Typical timeline

2y 11m

Avg Prosecution

9 currently pending

Career history

117

Total Applications

across all art units

Statute-Specific Performance

§101

0.6%

-39.4% vs TC avg

§103

87.9%

+47.9% vs TC avg

§102

7.5%

-32.5% vs TC avg

§112

4.0%

-36.0% vs TC avg

Black line = Tech Center average estimate • Based on career data from 103 resolved cases

Office Action

§103

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Drawings
The drawings are objected to because: the drawings mailed 4/12/2024, specifically drawings 3C and 3D-3E have an abnormal black box located on the page. The examiner is not sure if this is a printer/copier error or if it is purposeful. Please address these drawings. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claims 1, 7, and 13 objected to because of the following informalities: "wherein the first region corresponds to a first portion of the initial object count" is not clear. It is not clear how a first region can correspond to a first portion of the initial object count, as a region cannot correspond to a portion of a count (number) of objects. For the sake of examination, the examiner has interpreted the limitation to split the first image into a first and second region, one with a greater object count (density) and the other with a smaller object count (density). However, though the limitation could potentially be interpreted as this explanation of two separate regions, it is unclear.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


    PNG
    media_image1.png
    347
    785
    media_image1.png
    Greyscale

 
Figure 2 from Li et al incorporated above for easy reference.
Claim(s) 1, 3-5, 7, 9-11, 13, and 15-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aerial Images).

Regarding claim 1, Li et al teaches a method for counting objects in an image (Abstract, Fig 2 and 3. DMNet, Object detection in high-resolution aerial images is a challenging task because of 1) the large variation in object size, and 2) non-uniform distribution of objects. A common solution is to divide the large aerial image into small (uniform) crops and then apply object detection on each small crop. In this paper, we investigate the image cropping strategy to address these challenges. i.e. a method for counting objects detected in an image. See 4.1-4.2 regarding using computer including processor and GPU); comprising:
formatting the first image into a second image having a second size less than the first size (Fig 2 and 3.2.1 Density map generation network, as MCNN [29] introduces two pooling layers, the output feature map will shrink by 4× for both height and width. To preserve the original resolution, we upsample the generated density map by 4× with cubic interpolation to restore the original resolution. For the case where the image height or width is not the multiplier of four, we directly resize the image to its original resolution. i.e. CNN downsamples the first image in its processing and the feature map is shrunk by 4x for height and width);
estimating, using a first object counting model, an initial object count in the second image; comparing the initial object count with an object count threshold (Fig 2 and 3.3.1, the core of DMNet is to properly crop images from the contextual information provided by density maps. As observed from the density mask provided in Fig. 1, the regions with more objects (labeled in yellow color) have higher pixel intensities compared with those with fewer objects. By placing a threshold within a region, we can estimate the object counts and filter out pixels in the region with no or limited objects accordingly. i.e. the first object counting model is model that predicts the density map, which compares the initial object count with an object count threshold);
generating a third image using a first region of the first image in response to the initial object count being greater than the object count threshold, wherein the first region corresponds to a first portion of the initial object count greater than a second portion of the initial object count corresponding to a second region of the first image (Fig 2 and 3.3.2-3.4, we generate image crops based on the density mask. First, we select all the pixels whose corresponding density mask value is “1”. Second, we merge the eight-neighbor connected pixels into a large candidate region. Finally, we use the candidate region’s circumscribed rectangle to crop the original image. We filter out the crops whose resolution is below the density threshold. i.e. generating cropped regions (first region) is based on the initial object count being greater than the object count threshold (see 3.3.1), and has a greater object count (density) than other, lesser dense, image portions);
determining, by a second object counting model, an updated first portion of the initial object count in the third image (Fig 2 and 3.3.1-3.4, after obtaining image crops from the density map, the next step is to detect objects and fuse results from both density crops and the whole image. Any existing modern detectors can be of the choice. We first run separate detection on original validation set and density crops. i.e. using a detector (second object counting model)  to update the object count in the cropped regions (cropped based on object count density)
compiling an updated object count for the first image based on the updated first portion of the initial object count in the third image; and transmitting a notification based on the updated object count (Fig 2 and 3-3.4, after obtaining image crops from the density map, the next step is to detect objects and fuse results from both density crops and the whole image. Any existing modern detectors can be of the choice. We first run separate detection on original validation set and density crops. Then we collect the predicted bounding boxes from density crops detection and add them back to the detection results of original images to fuse them together. Finally, we apply non maximum suppression (NMS) to all bounding boxes and calculate the final results. i.e. fusing the results of object counting/detection from both the cropped sections and the original image as can be seen in Figure 2 is compiling an updated object count for the first (original) image. Outputting a final detection with bounding boxes such as is shown in Figure 2 can be considered transmitting a notification based on the updated object count (based on the object counting and bounding boxes are included in the output). Also see 4. Experiments and Tables 1-2 that output different counts based on the detection results). 
Li et al does not teach, receiving a first image having a first size greater than a size threshold.
In a similar field of endeavor, Zhang et al teaches, receiving a first image having a first size greater than a size threshold (Fig 1 and III. A. General Framework, in the image crop module, different cropping strategies of uniformly and nonuniformly are combined application in the training and inferencing phases only if the image size exceeds the threshold (default is 640). i.e. an image is received that has a size that exceeds a threshold, which then determines its object density cropping strategy).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date to incorporate the teachings of Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aerial Images) so that the method includes receiving a first image having a first size greater than a size threshold. Doing so would allow the system to enhance the size of small objects within small pedestrian target detection with excellent performance and high robustness (III. A. General Framework, Zhang et al).

Regarding claim 3, Li et al teaches the method of claim 1, wherein generating the third image further comprises: partitioning the second image into a plurality of regions using a density map representing the second image, in response to the initial object count being greater than the object count threshold; analyzing the density map to identify a cluster of objects; and identifying a dense region in the second image corresponding to the cluster of objects in the density map (Fig 2 and 3 -3.4, the generated density mask indicates the presence of objects. We generate image crops based on the density mask. First, we select all the pixels whose corresponding density mask value is “1”. Second, we merge the eight-neighbor connected pixels into a large candidate region. Finally, we use the candidate region’s circumscribed rectangle to crop the original image. We filter out the crops whose resolution is below the density threshold. The reasons are: (1) some of the predicted density maps are not in high quality and contain noise that spreads over the whole map given a low density threshold. i.e. if the initial object count is greater than the object count threshold, a cluster of objects is determined from the density map, see Figure 3-4, and a dense region in the second image (sliding window determined to determine density) is identified as a dense region corresponding to the cluster of objects in the density map).  

Regarding claim 4, Li et al teaches the method of claim 3, wherein generating the third image further comprises: mapping the dense region in the second image to the first image to define the first region in the first image; and cropping the first region in the first image to form the third image, the third image corresponding to the cluster of objects in the density map representing the second image, the third image having a same resolution as the first image (Fig 2 and 3.3.2-3.4, the generated density mask indicates the presence of objects. We generate image crops based on the density mask. First, we select all the pixels whose corresponding density mask value is “1”. Second, we merge the eight-neighbor connected pixels into a large candidate region. Finally, we use the candidate region’s circumscribed rectangle to crop the original image. We filter out the crops whose resolution is below the density threshold. The reasons are: (1) some of the predicted density maps are not in high quality and contain noise that spreads over the whole map given a low density threshold. i.e. the dense region is mapped using a sliding window (considered second image) to determine the high-density areas that are cropped (third image). Also, see Fig 2 and 3.3.2, the cropped regions are the same resolution as the first image, and are not downsized/upsampled).  

Regarding claim 5, Li et al teaches the method of claim 1, wherein compiling the updated object count further comprises adding a first object count obtained using the third image to a second object count corresponding to the second region of the first image (Fig 2 and 3-3.4, after obtaining image crops from the density map, the next step is to detect objects and fuse results from both density crops and the whole image. Any existing modern detectors can be of the choice. We first run separate detection on original validation set and density crops. Then we collect the predicted bounding boxes from density crops detection and add them back to the detection results of original images to fuse them together. Finally, we apply non maximum suppression (NMS) to all bounding boxes and calculate the final results. i.e. compiling includes adding (fusing) the results of the cropped region image count (third image) and the original image count (first image)) to output a 'final detection' that includes the final object count.  


Regarding claim 7 and 13, claim 7 and 13 rejected for the same reasons as claim 1 in the combination above.

Regarding claim 9 and 15, claim 9 and 15 rejected for the same reasons as claim 3 in the combination above.

Regarding claim 10 and 16, claim 10 and 16 rejected for the same reasons as claim 4 in the combination above.

Regarding claim 11 and 17, claim 11 and 17 rejected for the same reasons as claim 5 in the combination above.


Claim(s) 2, 8, and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aeral Images) and Yano et al (US 20190333241 A1).

Regarding claim 2, Li et al and Zhang et al do not teach the method of claim 1, further comprising outputting the initial object count in response to the initial object count not exceeding the object count threshold.  
In a similar field of endeavor, Yano et al teaches the method of claim 1, further comprising outputting the initial object count in response to the initial object count not exceeding the object count threshold (Fig 7 and Para 67-68, in the present exemplary embodiment, in a case where it is determined that the input image is an uncrowded image (No in S24), the person detection unit 230 acquires the positions of persons (S25). In a case where it is determined that the input image is a crowded image, the crowd number-of-people estimation unit 240 acquires the positions of persons (S26). In a situation in which a certain place is crowded with people, it is difficult for the person detection unit 230 to detect persons with high accuracy because the persons overlap one another in an input image and some portions of the persons are hidden. In contrast, in a situation in which persons are present in a scattered manner, the person detection unit 230 can detect the number of persons with higher accuracy than the crowd number-of-people estimation unit 240. Thus, in the present exemplary embodiment, the number of persons can be detected and estimated by an appropriate method in accordance with the determination result from the density determination unit 220. i.e. outputting the initial object count in response to the initial object count not exceeding the object count threshold (crowded image), if the image is not crowded then the persons detected in the non-crowded image are output). 
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date to incorporate the teachings of Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aeral Images) and Yano et al (US 20190333241 A1) so that the method comprises outputting the initial object count in response to the initial object count not exceeding the object count threshold. Doing so would allow the person detection unit 230 to detect the number of persons with higher accuracy than the crowd number-of-people estimation unit 240. Thus, in the present exemplary embodiment, the number of persons can be detected and estimated by an appropriate method in accordance with the determination result from the density determination unit 220 (Yano et al., Para 68).

Regarding claim 8 and 14, claim 8 and 14 rejected for the same reasons as claim 2 in the combination above.


Claim(s) 6, 12, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aerial Images) and Deng et al (NPL: A Global-Local Self-Adaptive Network for Drone-View Object Detection).

Regarding claim 6, Li et al and Zhang et al do not teach the method of claim 1, wherein the first object counting model and the second object counting model are a same object counting model.
In a similar field of endeavor, Deng et al teaches the method of claim 1, wherein the first object counting model and the second object counting model are a same object counting model (Fig 2 and A. Global-Local Detection Network, as shown in Fig 3, the GLDN predicts object bounding boxes in two phases: global coarse detection on original images and local fine detection on the cropped sub-images. The global coarse detection predicts the outline clues on the down-sampled whole images. After cropping sub-images with SARSA and conducting super-resolution with LSRN, the local fine detection can predict more accurate results for refinement. The two-stage detection results are merged by NMS, getting the final optimal results. It should be mentioned that, the global coarse detection and local fine detection are implemented by the same detector, despite the scale divergence between the two-stage images, our GLDN is robust to accommodate this problem. i.e. the global and local detection paths, which can be considered the first (global object detector/counter which is used to estimate the objects in the entire first image and then afterwards a density map is generated) and second object counting model (local is cropped counting/detection model for high-density crops, occurs after first model), are performed using a first and second model that are a same object counting model ('implemented by the same detector')).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date to incorporate the teachings of Li et al (NPL: Density Map Guided Object Detection in Aerial Images) in view of Zhang et al (NPL: Finding Nonrigid Tiny Person With Densely Cropped and Local Attention Object Detector Networks in Low-Altitude Aerial Images) and Deng et al (NPL: A Global-Local Self-Adaptive Network for Drone-View Object Detection) so that the first object counting model and the second object counting model are a same object counting model. Doing so would allow for the method to coarsely detect the regions and objects confined by the original image first, and then continue to refine the detection for certain tiny-scale objects in an adaptive […] this can improve the robustness of scale-variant detection (Deng, I. Introduction).

Regarding claim 12 and 18, claim 12 and 18 rejected for the same reasons as claim 6 in the combination above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US-20220286599-A1
US-20210089816-A1
US-20200311440-A1
US-11461992-B2
C. Duan, Z. Wei, C. Zhang, S. Qu and H. Wang, "Coarse-grained Density Map Guided Object Detection in Aerial Images," in 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada, 2021, pp. 2789-2798, doi: 10.1109/ICCVW54120.2021.00313. (Year: 2021)
Z. Ma, Lei Yu and A. B. Chan, "Small instance detection by integer programming on object density maps," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 3689-3697, doi: 10.1109/CVPR.2015.7298992. (Year: 2015)


Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACK PETER KRAYNAK whose telephone number is (703)756-1713. The examiner can normally be reached Monday - Friday 7:30 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JACK PETER KRAYNAK/Examiner, Art Unit 2668                                           
/UTPAL D SHAH/Primary Examiner, Art Unit 2668

Read full office action

Prosecution Timeline

Jan 03, 2024

Application Filed

Jan 16, 2026

Non-Final Rejection mailed — §103

Apr 22, 2026

Response Filed

Precedent Cases

Applications granted by this same examiner with similar technology

18/189,447

Patent 12639824

IMAGE PROCESSING METHOD AND APPARATUS FOR SEGMENTING IMAGES

3y 2m to grant Granted May 26, 2026

18/533,022

Patent 12632953

Semiconductor Molding System and Foreign Object Detection Method

2y 5m to grant Granted May 19, 2026

17/780,289

Patent 12626409

ENCODING AND DECODING VIEWS ON VOLUMETRIC IMAGE DATA

3y 11m to grant Granted May 12, 2026

17/744,733

Patent 12614264

NON-DESTRUCTIVE METHOD TO PREDICT SHELF LIFE AND MATURITY OF PERISHABLE COMMODITIES

3y 11m to grant Granted Apr 28, 2026

18/127,100

Patent 12608845

CAMERA HEALTH MONITORING AND ALERTING SYSTEM

3y 0m to grant Granted Apr 21, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.

Typically takes 5-10 seconds — AI-generated, attorney review required before filing

Prosecution Projections

1-2

Expected OA Rounds

79%

Grant Probability

98%

With Interview (+19.3%)

2y 11m (~6m remaining)

Median Time to Grant

Low

PTA Risk

Based on 103 resolved cases by this examiner. Grant probability derived from career allowance rate.