DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/16/2026 has been entered.
Response to Arguments
Applicant’s arguments, see Remarks pages 7-9, filed 01/16/2026, with respect to the rejections of amended claims 1, 10, and 19 under 35 U.S.C. 103 have been fully considered and are moot in view of the new grounds of rejection (detailed in the rejections below) necessitated by Applicant’s amendments to the claims.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 1, the limitation “determining whether the plurality of transformation parameters exceed at least one readable threshold, wherein the readable threshold is a number of degrees askew from an axis,” is regarded as indefinite because it is unclear whether the “at least one readable threshold” consists of only “a number of degrees askew from an axis,” or comprises a plurality of readable thresholds which include “a number of degrees askew from an axis.” For the purposes of examination, the limitation is interpreted as “determining whether the plurality of transformation parameters exceed at least one readable threshold, wherein the readable thresholds include a number of degrees askew from an axis.”
Regarding claims 2-9, they are rejected under 35 U.S.C. 112(b) for inheriting, and failing to cure, the deficiencies of parent claim 1.
Regarding claims 10 and 19, the arguments made in rejecting claim 1 are analogous.
Regarding claims 11-18 and 20, they are rejected under 35 U.S.C. 112(b) for inheriting, and failing to cure, the deficiencies of parent claims 10 and 19, respectively.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Migdal et al. (US 12,131,224 B1), hereinafter referenced as Migdal, in view of Hansen et al. (Real-Time Barcode Detection and Classification using Deep Learning), hereinafter referenced as Hansen.
Regarding claim 1, Migdal discloses: A computer-implemented method, comprising: receiving an image of a physical object comprising graphical data having an undefined perspective distortion (Migdal: Col 2: Lines 47-58: “embodiments described herein provide improved systems and techniques for reading structured identifiers on items using camera-based scanning systems…the structured identifier within the image that is captured via a camera-based scanning system may be non-decodable due to one or more image defects (e.g., image noise, blurriness, low sharpness, low lighting, low contrast, compression artifacts, etc.).”;
Col 7: Lines 39-42: “…geometric transformations may be used to rectify the shape of a structured identifier due to perspective and lens distortion, package deformation, structured identifier wrinkling, etc.”; Wherein the structured identifier constitutes graphical data and distorted shape of the identifier constitutes the perspective distortion);
determining that an image transformation is to be performed on the received image (Migdal: Figure 4; Col 12: Lines 38-47: “…the analysis tool 322 may provide the structured identifier images 336 to the decoding tool 326. If the decoding tool 326 is able to successfully decode a structured identifier from the structured identifier images 336, then the enhancement tool 324 may not be applied or implemented. On the other hand, if the decoding tool 326 is unable to successfully decode a structured identifier from the structured identifier images 336, then the decoding tool 326 can send the structured identifier images 336 to the enhancement tool 324 for an enhancement process.”);
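For illustration only, the decode-first control flow cited above can be sketched as follows; this is a minimal reading of Migdal's Figure 4 description, and the helper functions are hypothetical placeholders rather than Migdal's actual decoding tool 326 or enhancement tool 324:
```python
# Minimal sketch (not Migdal's code): attempt a direct decode and apply
# enhancement only if decoding fails, per the passage cited above.
from typing import Optional

def try_decode(image) -> Optional[str]:
    """Hypothetical stand-in for the decoding tool; returns None on failure."""

def enhance(image):
    """Hypothetical stand-in for the enhancement tool (image restoration)."""

def read_structured_identifier(image) -> Optional[str]:
    result = try_decode(image)
    if result is not None:
        return result                      # enhancement not applied on success
    return try_decode(enhance(image))      # enhance, then retry decoding
```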
providing the received image to a neural network configured to generate a plurality of transformation parameters corresponding to the perspective distortion (Migdal: Figure 4; Col 11: Lines 7-9: “the image restoration model(s) 342 may be configured to use deep learning-based image restoration as an image enhancement technique.”;
Col 11-12: Lines 66-05: “the enhancement tool 324 can be configured to selectively apply different image enhancement techniques (via the image restoration model(s) 342) based on the type of image defect (or corruption). For example, if the detected structured identifier is heavily distorted, then the enhancement tool 324 may apply geometric transforms to rectify the structured identifier.”; Wherein the deep learning models extract transformation parameters used to apply transformations to rectify defects within the identifier),
wherein the neural network was trained on a set of training data comprising: a first set of images including objects that are the same category of physical object as the received image, in a readable orientation; and a second set of images with a predefined perspective transformation and predefined parameters, corresponding to the plurality of transformation parameters, for each of the second set of images (Migdal: Figure 7; Col 11: Lines 32-49: “The low quality counterparts to the cropped regions may be created using a variety of image augmentation transforms, including, but not limited to, focal and motion blur, JPEG artifacts, shutter effects, brightness, contrast, and other image processing corruption methods…The high quality images 702 1-4 are high quality crops taken from imagery consisting of a mixture of structured identifiers and other regions…The low quality images 704 1-4 are generated by using an image augmentation transform on the high quality images 702 1-4, respectively.”; Wherein the high quality barcode images constitute the first set of images, and the low quality barcode images constitute the second set of images. Also, wherein the augmentation transforms applied to the first set constitute the predefined parameters.);
performing the image transformation on the received image to generate a rectified image comprising the graphical data oriented within a readable threshold (Migdal: Figure 4; Col 7-8: Lines 43-02: “In one embodiment, at least one image restoration model 342 is configured to perform a deep learning-based image restoration to enhance the structured identifier image(s) 336…The decoding tool 326 is generally configured to extract information associated with the structured identifier 175 (referred to herein as structured identifier information 340)…the decoding tool 326 extracts information associated with the structured identifier 175 from the enhanced structured identifier images 338.”; Wherein the image restoration models extract the transformation parameters from the identifier in the image and rectify the image using them; the decoding tool is then able to read the identifiers).
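As an illustrative aside, a generic homography-based rectification of the kind the cited passages describe (a geometric transform mapping a distorted identifier onto an axis-aligned rectangle) might look as follows in OpenCV; the corner coordinates and output size are assumptions, and this is not Migdal's implementation:
```python
# Generic perspective rectification sketch (assumed, not from Migdal):
# warp a detected quadrilateral onto an axis-aligned rectangle.
import numpy as np
import cv2

def rectify_identifier(image, corners, out_w=400, out_h=200):
    """corners: 4x2 array, clockwise from top-left, from an upstream detector."""
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w - 1, 0],
                      [out_w - 1, out_h - 1], [0, out_h - 1]])
    H = cv2.getPerspectiveTransform(src, dst)    # 3x3 homography
    return cv2.warpPerspective(image, H, (out_w, out_h))
```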
Migdal does not disclose expressly: receiving, from the neural network, values for the plurality of transformation parameters of the received image; determining whether the plurality of transformation parameters exceed at least one readable threshold, wherein the readable threshold is a number of degrees askew from an axis; and performing, in response to determining that the plurality of transformation parameters exceed the at least one readable threshold, the image transformation on the received image, based on the values for the plurality of transformation parameters, to generate a rectified image comprising the graphical data oriented within the at least one readable threshold.
Thus, Migdal does not disclose expressly: the explicit thresholding of the plurality of transformation parameters against at least one readable threshold, wherein the readable threshold is the number of degrees the image is askew from an axis; a rectified image, comprising the barcode oriented within the at least one readable threshold, then generated in response to that determination.
Hansen discloses: receiving, from a neural network, values for transformation parameters of a received image; determining whether the transformation parameters exceed at least one readable threshold, wherein the readable threshold is a number of degrees askew from an axis (Hansen: Figure 1; Section: 3 OUR APPROACH: “we found out that rotating the 1D barcodes such that the bars are vertical and rotating QR barcodes so that the sides of the small squares align with the x and y axis can benefit the performance of the decoding. For 1D barcodes, there is a speedup in time and a higher decoding rate, whereas for the QR barcodes the decoding will take longer, but the decoding success rate is higher. To find the amount of rotation needed a regression network is used to predict a rotation value between 0 and 1. The value will be mapped to an angle going from 0 to 180 for 1D and 45 and 135 for QR barcodes. At fig. 2, the method on how the angle is measured is shown.”; Wherein the determination that a rotation angle is needed, i.e., that the barcode requires rotation, constitutes a determination that the threshold is exceeded.); and
performing, in response to determining that the transformation parameters exceed the at least one readable threshold, the image transformation on the received image, based on the values for the transformation parameters, to generate a rectified image comprising the graphical data oriented within the at least one readable threshold (Hansen: Figure 1; Section: 3 OUR APPROACH: “Each of these barcodes is then put through the Angle prediction network which predicts a rotation and the predicted rotation is then used to rotate the image before it is tried decoded by a decoding framework”; Wherein the image is rectified by rotating the image based on a predicted rotation).
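For illustration, the thresholding and rotation Hansen describes can be sketched as below. The mapping of the [0, 1] network output to 0-180 degrees follows Hansen's description for 1D barcodes; the 5-degree readable threshold and the rotation sign convention are illustrative assumptions, not values from either reference:
```python
# Sketch of claim 1's thresholding step under Hansen's angle prediction
# (assumed details marked below; not the references' actual code).
import cv2

READABLE_THRESHOLD_DEG = 5.0   # assumed threshold, degrees askew from an axis

def rectify_if_needed(image, net_output):
    angle = net_output * 180.0         # Hansen: map [0, 1] -> [0, 180] deg (1D)
    skew = angle % 90.0                # degrees askew from the nearest axis
    skew = min(skew, 90.0 - skew)
    if skew <= READABLE_THRESHOLD_DEG:
        return image                   # within threshold: no transformation
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)  # assumed sign
    return cv2.warpAffine(image, M, (w, h))
```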
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement the Angle prediction network disclosed by Hansen into the image enhancement tool disclosed by Migdal prior to selectively applying the different image enhancement techniques. The suggestion/motivation for doing so would have been “The developed system can also find the rotation of both the 1D and QR barcodes, which gives the opportunity of rotating the detection accordingly which is shown to benefit the decoding process in a positive way. Both the detection and the rotation prediction shows real-time performance.” (Hansen: Abstract.). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
In addition, before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to apply the known technique, further taught by Hansen, of detecting the rotation angle of an image with a neural network prior to performing image rectification based on a selected value, by thresholding the image defect values determined by the image restoration model(s) disclosed by Migdal in view of Hansen before selectively applying the different image enhancement techniques. The suggestion/motivation for doing so would have been “Through some small scale test, we found out that rotating the 1D barcodes such that the bars are vertical and rotating QR barcodes so that the sides of the small squares align with the x and y axis can benefit the performance of the decoding.” (Hansen: 3 OUR APPROACH; Wherein the selection of values for parameters allows for adjustment for improved performance.). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Migdal with Hansen, with the further teaching of Hansen, to obtain the invention as specified in claim 1.
Regarding claim 2, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, further comprising: providing the rectified image to a system configured to read the graphical data oriented within the at least one readable threshold (Migdal: Figure 4; Col 12: Lines 10-15: “As shown in FIG. 4, the enhancement tool 324 provides the enhanced structured identifier images 338 to the decoding tool 326, which is configured to determine structured identifier information 340, based on reading the structured identifier(s) 175 in the enhanced structured identifier images 338.”).
Regarding claim 3, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, wherein the neural network is configured to perform pixel analysis on pixels of the graphical data of the image (Hansen: Section: 3 OUR APPROACH: “At fig. 2, the method on how the angle is measured is shown. The regression network is based on the Darknet19 classification network1 where the softmax layer is removed, and the number of filters in the last convolutional layer is set to one. Furthermore, three different activation functions are tried in the last convolutional as well, Leaky ReLU, Logistic and ReLU.”)
(Migdal: Col 11: Lines 50-56: “To train the deep learning-based model, the images 702 and 704 are fed in pairs consisting of the high quality cropped original (e.g., high quality image 702) and its low quality corrupted counterpart (e.g., low quality image 704). The output is a model trained to take as input a low quality, possibly corrupted image, and reconstruct a high quality representation of the image.”).
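As an illustrative sketch of the architecture Hansen describes (a classification backbone with the softmax removed, a final convolution with a single filter, and a logistic activation producing one value in [0, 1]), one might write the following, assuming a generic backbone in place of Darknet19:
```python
# Assumed PyTorch sketch of Hansen's regression head; the backbone and
# pooling are stand-ins, not Hansen's Darknet19 implementation.
import torch
import torch.nn as nn

class AnglePredictor(nn.Module):
    def __init__(self, backbone: nn.Module, backbone_channels: int):
        super().__init__()
        self.backbone = backbone                        # feature extractor
        self.head = nn.Conv2d(backbone_channels, 1, 1)  # one filter
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        f = self.backbone(x)
        out = torch.sigmoid(self.head(f))   # logistic activation -> [0, 1]
        return self.pool(out).flatten(1)    # single rotation value per image
```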
Regarding claim 4, Migdal in view of Hansen discloses: The computer-implemented method of claim 3, wherein the graphical data comprises one of a barcode or a QR (quick response) code (Migdal: Col 2: Lines 8-13: “Such a computer-vision system can use the camera-based structured identifier reader(s) to detect and read (or decode) structured identifiers (e.g., linear barcodes, one-dimensional (1D) barcodes, two-dimensional (2D) barcodes, etc.) on various items (or packages) in the facility.”).
Regarding claim 5, Migdal in view of Hansen discloses: The computer-implemented method of claim 4, wherein the barcode comprises parallel lines (Migdal: Figures 6A-6G and 7; Wherein the barcodes contain parallel lines).
Regarding claim 6, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, further comprising: identifying a plurality of graphical objects on the received image (Migdal: Col 6: Lines 41-45: “In a particular embodiment, the detector tool 320 uses a ML model (e.g., ML-based localizer model) to determine the bounding box of each structured identifier 175 within a respective image 330.”; Wherein determining a bounding box for each identifier within an image means that multiple identifiers may be present in an image); selecting a first graphical object from the plurality of graphical objects; and providing the first graphical object to the neural network, wherein the performing comprises performing the image transformation on the plurality of graphical objects based on the plurality of transformation parameters of the first graphical object (Hansen: Section: 3 OUR APPROACH: “The system first receives an input image, and then it is fed through the YOLO detection system which produces a number of detections depending on the number of barcodes in the image. Each of these barcodes is then put through the Angle prediction network which predicts a rotation and the predicted rotation is then used to rotate the image before it is tried decoded by a decoding framework.”)
(Migdal: Figure 4; Col 11-12: Lines 66-05: “the enhancement tool 324 can be configured to selectively apply different image enhancement techniques (via the image restoration model(s) 342) based on the type of image defect (or corruption). For example, if the detected structured identifier is heavily distorted, then the enhancement tool 324 may apply geometric transforms to rectify the structured identifier.”; Wherein the DCT CNN defect detector outputs identified defects within the structural identifier image on a patch by patch basis, and the enhancement tool applies image restoration models based on the identified distortions.).
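For illustration, the multi-object flow mapped to claim 6 (detect several identifiers, run the angle network on a first object, and apply the resulting transformation across the set) might be sketched as follows; all helper names are hypothetical placeholders, not the references' code:
```python
# Hypothetical sketch of the claim 6 mapping: localize all identifiers,
# predict rotation from the first, apply the same rotation to all.
import cv2

def detect_boxes(image):
    """Stand-in for Migdal's ML-based localizer; returns (x0, y0, x1, y1) boxes."""
    return []  # placeholder

def predict_rotation(crop):
    """Stand-in for Hansen's angle prediction network; returns degrees."""
    return 0.0  # placeholder

def rectify_all(image):
    boxes = detect_boxes(image)                     # plurality of objects
    crops = [image[y0:y1, x0:x1] for (x0, y0, x1, y1) in boxes]
    angle = predict_rotation(crops[0])              # first object's parameters
    rotated = []
    for c in crops:                                 # same transform for all
        h, w = c.shape[:2]
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated.append(cv2.warpAffine(c, M, (w, h)))
    return rotated
```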
Regarding claim 7, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, further comprising providing, in response to determining that the plurality of transformation parameters do not exceed the at least one readable threshold, the received image to an object reader without performing the transformation (Migdal: Figure 4; Col 11-12: Lines 66-05: “the enhancement tool 324 can be configured to selectively apply different image enhancement techniques (via the image restoration model(s) 342) based on the type of image defect (or corruption). For example, if the detected structured identifier is heavily distorted, then the enhancement tool 324 may apply geometric transforms to rectify the structured identifier.”; Wherein the image enhancement techniques are applied based on corruptions/defects detected.).
Regarding claim 8, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, wherein the graphical data comprises alphanumeric text with a known pattern (Migdal: Figures 6A-6G; Col 3: Lines 11-16: “the structured identifier can include any type of globally unique identifier used to identify an item. Examples of such structured identifiers can include visual and/or geometric features of a label (e.g., ridges, edges, pixel value intensity changes), text, 1D barcodes, two-dimensional (2D) barcodes, etc.”;
Col 7: Lines 56-61: “The structured identifier information 340 may include symbology used for the structured identifier 175, the detected corners of the structured identifier 175, the decoded text or binary information associated with the structured identifier 175 (e.g., the alphanumeric string or binary information), or combinations thereof.”; Wherein the alphanumeric string/text constitutes an alphanumeric text with a known pattern).
Regarding claim 9, Migdal in view of Hansen discloses: The computer-implemented method of claim 1, wherein the second set of images includes at least a subset of the first set of images re-oriented with the known perspective transformation (Migdal: Col 11: Lines 25-36: “the deep learning-based model may be trained by creating datasets consisting of high quality original images and low quality corrupted images…The low quality counterparts to the cropped regions may be created using a variety of image augmentation transforms, including, but not limited to, focal and motion blur, JPEG artifacts, shutter effects, brightness, contrast, and other image processing corruption methods.”;
Col 11: Lines 47-49: “The low quality images 704 1-4 are generated by using an image augmentation transform on the high quality images 702 1-4, respectively.”).
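As an illustrative note on this training-data limitation, generating the second set from the first with a predefined transform whose parameters are recorded as labels might be sketched as below; the angle values and the rotation-based augmentation are assumptions standing in for Migdal's broader set of augmentation transforms:
```python
# Assumed sketch: derive the "second set" from readable originals by a
# predefined transformation, keeping its parameters as training labels.
import cv2

def make_training_pair(readable, angle_deg):
    h, w = readable.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    distorted = cv2.warpAffine(readable, M, (w, h))
    return distorted, {"rotation_deg": angle_deg}   # predefined parameters

# e.g., pairs = [make_training_pair(img, a) for a in (5.0, 15.0, 30.0, 45.0)]
```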
Regarding claim 10, the arguments made in rejecting claim 1 are analogous. In addition, Figure 3 and Col 15-16: Lines 65-20 of Migdal disclose at least one processor coupled to memory.
Regarding claim 11, the arguments made in rejecting claim 2 are analogous.
Regarding claim 12, the arguments made in rejecting claim 3 are analogous.
Regarding claim 13, the arguments made in rejecting claim 4 are analogous.
Regarding claim 14, the arguments made in rejecting claim 5 are analogous.
Regarding claim 15, the arguments made in rejecting claim 6 are analogous.
Regarding claim 16, the arguments made in rejecting claim 7 are analogous.
Regarding claim 17, the arguments made in rejecting claim 8 are analogous.
Regarding claim 18, the arguments made in rejecting claim 9 are analogous.
Regarding claim 19, the arguments made in rejecting claim 1 are analogous. In addition, Col 15-16: Lines 65-20 of Migdal disclose a non-transitory computer-readable medium having instructions stored thereon that are executed by a processor.
Regarding claim 20, the arguments made in rejecting claim 2 are analogous.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTHONY J RODRIGUEZ whose telephone number is (703) 756-5821. The examiner can normally be reached Monday-Friday 10am-7pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANTHONY J RODRIGUEZ/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672