Prosecution Insights
Last updated: April 18, 2026
Application No. 18/358,506

MACHINE LEARNING-BASED TEXT RECOGNITION SYSTEM WITH FINE-TUNING MODEL

Office Action: Non-Final OA (§103)
Filed: Jul 25, 2023
Examiner: CADY, MATTHEW ALAN
Art Unit: 2145
Tech Center: 2100 — Computer Architecture & Software
Assignee: Hyper Labs Inc.
OA Round: 1 (Non-Final)
Grant Probability: Favorable
Estimated OA Rounds: 1-2
Estimated Time to Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal; based on resolved cases with interview)
Typical Timeline: 3y 3m avg prosecution
Career History: 11 total applications across all art units (11 currently pending)

Statute-Specific Performance

§101: 24.3% (-15.7% vs TC avg)
§103: 43.2% (+3.2% vs TC avg)
§102: 13.5% (-26.5% vs TC avg)
§112: 18.9% (-21.1% vs TC avg)

Based on career data from 0 resolved cases; Tech Center averages are estimates.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 2-3, 5-7, 9-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Richard J. Becker et al. (hereinafter Becker) (US 10621727 B1, 2020-04-14) in view of Nirmit V. Desai et al. (hereinafter Desai) (US 11164108 B2, 2021-11-02).

Regarding claim 2, Becker teaches; receiving, at a processor of a first compute device, a set of field images, each field image from the set of field images associated with a part of a document from a set of documents; ([col. 8, lines 48-51] At block 602, the processors identify an image of a form. The image may have been taken using a digital camera or a scanner. The form may be, for example, a tax form such as a W-2, a 1099-MISC, a 1098-T, or a 1040.) NOTE: Teaches a set of documents (forms including a W-2, a 1099-MISC, a 1098-T, or a 1040). [fig. 3] ([col. 6 lines 41-45] FIG. 3 is a more detailed view of the image segmenter 202, according to one embodiment. As shown, the digital image 308 is an image of a W2 tax form. The image segmenter 202 segments the digital image 308 using both a line segmenter 302 and a paragraph segmenter 304.) ([col.
10 lines 11-13] At block 626, the processors store image segments, instances, classifications, and extracted text from blocks 608-624 in a data store.) NOTE: Teaches receiving (obtaining the image segments via the disclosed process is considered receiving) a set of field images (image segments; 310, 312, 314, 316, 318, which each represent fields of the documents, and are therefore considered field images), each field image from the set of field images associated with a part of a document (310 associated with box 1 of W-2 form, 312 associated with box 2 of W-2 form, etc.) from the set of documents (W-2 was one of the aforementioned documents from the set of documents). ([col. 6 lines 60-61] At block 604, the processors segment the image of the form using multiple segmentation methods.) NOTE: Discloses processors (which includes at least a processor of a first compute device) for the aforementioned method of receiving the field images (image segments), therefore teaching receiving the aforementioned field images at a processor of a first compute device. and generating a second machine learning model, via the processor, based on (1) the first set machine learning model and (2) the set of field images, the second machine learning model configured to identify document parts associated with the set of documents. ([col. 7 lines 20-31] FIG. 5 illustrates an example of training the segment classifier 206 to classify image segments without OCR. As shown, the segment classifier 206 includes a machine-learning model 506 … Training Data 108 can include training image segments 502. The training image segments 502 can include image segments that have been assigned verified classifications. For example, the training image segments 502 can comprise image segments that have been classified as box 1 fields from images of W-2 tax forms.
) NOTE: Teaches generating (training a machine learning model is considered generating, because training is the act that produces the operative trained version of the model ) a machine learning model (the machine learning model of the segment classifier) based on the field images (the training data can include the training image segments, such as an image representing a field from the W-2 forms) [fig. 5] ( [col. 11 lines 36-44] The image segmenter 202 can identify and separate image segments that are found within the digital image. The feature extractor 204 can, for each image segment, extract or detect a set of respective features. The segment classifier 206 can use the set of respective features for each image segment to assign a classification for the image segment. The classification may associate the image segment with a certain field type or label type. ) NOTE: Teaches the machine learning model identifying document parts associated with the set of documents (the classification generated by the machine learning model of the segment classifier identifies the field type or label type [which are parts of the document] of the input image segment, which is considered identifying document parts . ) Becker fails to teach but Desai teaches; receiving, at the processor and from a second compute device remote from the first compute device, a first machine learning model; ( [col. 13 lines 21-27] The hardware in FIGS. 1-2 may vary depending on the implementation. Other internal hardware or peripheral devices, such as flash memory, equivalent non-volatile memory, or optical disk drives and the like, may be used in addition to or in place of the hardware depicted in FIGS. 1-2. In addition, the processes of the illustrative embodiments may be applied to a multiprocessor data processing system. 
) NOTE: Teaches that the above compute devices (each of the above devices are capable of performing computations, and are therefore considered compute devices) may include processors for performing the processes of which they are capable. ([col. 11 lines 1-4] Application 105 implements an embodiment described herein. Application 105 trains base model 103 using public data 109 and delivers trained base model to nodes such as device 132 and client 114.) NOTE: Teaches receiving, at the processor (as previously mentioned, the embodiments of fig. 1 may include processors, such as device 132, which is a smart phone) and from a second compute device remote to the first compute device (device 132 can be considered the first compute device, and application 105 can be considered the second compute device, where the second compute device is remote to the first compute device, as shown in fig. 1), a first machine learning model (application 105 delivers the trained base model to device 132, i.e., the processor of the first device 132 receives a first machine learning model from the second compute device 105). and generating a second machine learning model, via the processor, based on (1) the first set machine learning model ([col. 11 lines 1-19] Application 105 implements an embodiment described herein. Application 105 trains base model 103 using public data 109 and delivers trained base model to nodes such as device 132 and client 114. Application 105 causes node 132 to perform transfer learning to produce personalized model 134 using local data 136 … In one embodiment, application 105 delivers program code to nodes 132 and 114, which causes nodes 132 and 114 to perform the operations described herein. In another embodiment, application 105 delivers commands to nodes 132 and 114, which causes nodes 132 and 114 to perform the operations described herein.
) NOTE: Teaches generating a second machine learning model (personalized model), via the processor (the aforementioned processor of device 132, capable of performing the processes of device 132), based on the first set machine learning model (performs transfer learning to produce the personalized model based on the trained base model and local data).

OBVIOUSNESS TO COMBINE DESAI WITH BECKER: Desai and Becker are both analogous art to each other and to the present disclosure. Desai pertains to a method of training a plurality of personalized machine learning models based on a trained base model, while Becker pertains to a method of identifying fields and labels of a digital image of a form using machine learning. Additionally, Desai states; ([col. 1 lines 56-67] A domain of a model represents key characteristics of the environment in which the data to train the model is generated. One such characteristic of a domain is a probability distribution of the training data used to train the model. Typically, a model's output in a deployment configuration is most accurate when its input comes from the same domain as the domain of the training data in the training configuration. For example, a model trained with data from consumer vehicles of a certain manufacturer make and model operating in a geographical region, such as a particular city, will perform the best when applied to vehicles and regions similar to those corresponding to the training data.) NOTE: This excerpt details that a model performs the best when trained on data of its specific domain or environment. From this, the purpose of training a second model from a first base model received from a remote computing device would be to tailor the second model to its specific domain, thus improving its performance.
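As an illustrative aside, the Desai-style flow relied on here (train a base model centrally, deliver its parameters to a remote node, then fine-tune locally into a second, domain-specific model) can be sketched with a toy one-parameter model. All names below are hypothetical illustrations, not APIs from the cited references:

```python
# Hedged sketch of the transfer-learning flow: a base model trained on public
# data at one compute device, then fine-tuned on local data at another.
# A mean estimator stands in for a real network.

def train_base_model(public_data):
    """Fit a one-parameter 'model' (a mean estimator) on public data."""
    return {"weight": sum(public_data) / len(public_data)}

def fine_tune(base_model, local_data, lr=0.5):
    """Produce a second, personalized model by nudging the received
    parameters toward the local-domain statistics (transfer learning)."""
    personalized = dict(base_model)  # start from the delivered parameters
    local_mean = sum(local_data) / len(local_data)
    personalized["weight"] += lr * (local_mean - personalized["weight"])
    return personalized

base = train_base_model([1.0, 2.0, 3.0])   # trained at the "second compute device"
second = fine_tune(base, [10.0, 12.0])     # generated at the "first compute device"
```

The second model starts from the first model's parameters and shifts toward the local domain, mirroring the base-model-plus-local-data generation Desai describes.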
In the context of a model for recognizing document (form) parts as taught by Becker, a base model trained on, for example, United Kingdom tax forms would have poor accuracy when utilized in the domain of other countries' tax forms, such as United States tax forms. The method of Desai could be used to generate a second model specific to United States tax forms based on the base model (UK tax form model) and United States tax-form-specific field images, to improve performance in the specified domain. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to generate a second model based on a first base model (as taught by Desai) for identifying document parts (as taught by Becker) to allow for improved identification in more specialized tasks.

Regarding claim 3, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) Becker teaches; Field images (Using the same reasoning as claim 2) Becker fails to teach but Desai teaches; wherein the second compute device is associated with an entity that does not have access to data of the first compute device, [col. 4] NOTE: Figure 1 and the following excerpt of column 4 teach that the second compute device (aforementioned application 105) is associated with an entity (client 114, for example) that does not have access to data of the first compute device (the aforementioned first compute device 132 contains private/local data, which based on the excerpt can be inaccessible to other entities in certain implementations, which can include client 114, for example). and the set of field images [data] includes data specific to the entity. (It is possible for a node to export the local data to another node, but in such a case, the node that receives the exported data (and may also be capable of training a model for another node using the exported data) is also regarded as simply a server.
) NOTE: It is possible for a node (such as client 114) to export local data (data specific to client 114) to another node (such as the aforementioned first compute device, 132) wherein the node that receives the exported data may use that data to train a model. This therefore teaches the set of data received by the first compute device includes data specific to the entity. One of ordinary skill in the art, before the effective filing date of the claimed invention, would have been able to substitute the ‘data’ in the above teaching with the ‘field images’ of claim 2 (taught by Becker) to obtain predictable results, as it is a simple substitution of machine learning input data. Based on this substitution, it would then be obvious for the set of field images of claim 2 to include data specific to the aforementioned entity.

Regarding claim 5, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) Becker teaches; executing the second machine learning model to identify the document parts associated with the set of documents without any modifications to the second machine learning model. ([col. 4 lines 52-59] In some embodiments, an image segment may be classified as a field that contains a specific type of information. This classification can be used to identify a subset of textual characters that may be depicted in the image segment. For example, if an image segment that has been classified as a field for a social security number (e.g., "box a" of W-2 form), the subset of textual characters may include digits and dashes and exclude letters.) NOTE: Teaches identifying (classifying is considered identifying) image segments as a particular part of a document (field of a form, such as ‘box a’ of a W-2 form) associated with the set of documents. ([col. 10 lines 52-55] At block 708, the processors assign a classification to the input instance using the machine-learning model.
The classification can associate the input instance with a field type or a label type. In some examples, the classification and the image segment can be provided for user inspection on a display. If the classification is erroneous, the user can provide feedback indicating a corrected classification. After this feedback is received, a training instance can be created for the machine learning model. The training instance comprises the plurality of features and the corrected classification.) NOTE: Teaches executing the machine learning model to identify the document parts associated with the set of documents (the aforementioned classification is performed by executing the machine learning model) without any modifications to the machine learning model (User feedback and creation of a training instance occur only after the classification is produced and found erroneous. Thus, the reference at least suggests that the model is first executed to identify the relevant field/label before any subsequent feedback-based retraining or modification of the model). Becker fails to teach but Desai teaches; The second machine learning model (using the same teaching from claim 2) Using the reasoning from claim 2, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, for the second machine learning model taught by Desai to be the model taught by Becker.

Regarding claim 6, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) Becker teaches; wherein the document parts include parts of one or more documents from the set of documents that contain a predefined type of data. ([col. 9 lines 46-60] At block 618, the processors define a character space for the image segment based on the classification that was assigned by the one or more machine-learning models.
In one example, if the classification indicates that the image segment is "box a" or "box b" from a W-2 form, the character space for the image segment can be defined as the digits 0-9 and the hyphen character. In another example, if the classification indicates that the image segment is "box l" of a W2 form, the character space for the image segment can be defined as the digits 0-9, the comma character, and the period character. In another example, if the classification indicates that the image segment is a field for a middle initial, the character space for the image segment can be defined as all capital and lower-case letters and the period character.) NOTE: Teaches the document parts (box a, box b) including parts of one or more documents from the set of documents (W-2 form from the aforementioned set of documents) that contain a predefined type of data (the aforementioned document parts contain specific digits, specific characters, etc., which are considered predefined types of data).

Regarding claim 7, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) where each document from the set of documents includes at least one of handwritten text or typewritten text. ([col. 8 lines 48-58] At block 602, the processors identify an image of a form. The image may have been taken using a digital camera or a scanner. The form may be, for example, a tax form such as a W-2, a 1099-MISC, a 1098-T, or a 1040. The form may have been printed on paper before the image was taken. The image may be in a raster format such as Joint Photographic Experts Group (JPEG), Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), Bitmap (BMP), or Portable Network Graphics (PNG). Alternatively, the image may be in a vector format such as Computer Graphics Metafile (CGM) or Scalable Vector Graphics (SVG).) NOTE: Discloses the documents of the disclosure being images of forms.
A scanned form plainly teaches text that is at least typewritten/printed, and often also handwritten. Forms contain preprinted text such as labels, headers, and field names, which constitutes typewritten text. And where the form includes filled-in entries, signatures, initials, or other user-added content, the form includes handwritten text. This therefore teaches the documents containing handwritten or typewritten text. Regarding claim 9, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) Becker fails to teach but Desai teaches; wherein all parameters of the first machine learning model are received during the receiving of the first machine learning model. ([col. 5 lines 63-67, col. 6 lines 1-3] A model parameter is a value associated with a characteristic of a model. For example, if the model is a neural network, the model parameters may be a set of weights corresponding to a set of neural network nodes in the model. As another example, if the model is a different type of algorithm, a set of model parameters may be set of constants, variables, or some combination thereof, used in one or more functions or computations in the algorithm. ) NOTE: A machine learning model (such as the first machine learning model taught by Desai) has a set of parameters. ( Application 105 implements an embodiment described herein. Application 105 trains base model 103 using public data 109 and delivers trained base model to nodes such as device 132 and client 114. ) NOTE: The first machine learning model (and therefore all of its corresponding parameters) is received by the first compute device (device 132). Regarding claim 10; Claim 10 is an apparatus claim directly corresponding to claim 2, and is therefore rejected using the same reasoning. Regarding claim 11; Claim 11 is an apparatus claim directly corresponding to claim 3, and is therefore rejected using the same reasoning. 
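As an illustrative aside on claim 9's "all parameters" limitation: for a network-style model, delivering the model amounts to delivering its full parameter set (e.g., the weights), which the receiving device reconstructs intact. A minimal sketch with hypothetical names, not drawn from Desai's disclosure:

```python
# Hedged sketch of "receiving all parameters of a model": serialize the full
# parameter set at the sending compute device, reconstruct it at the receiver.
import json

def export_parameters(model_weights):
    """Serialize every parameter of the model for transfer to a remote node."""
    return json.dumps(model_weights)

def import_parameters(payload):
    """Reconstruct the model's parameter set on the receiving compute device."""
    return json.loads(payload)

sent = {"layer1": [0.1, -0.3], "layer2": [0.7]}       # toy weight sets
received = import_parameters(export_parameters(sent))  # all parameters arrive intact
```

The round trip preserves every weight, which is the sense in which "all parameters of the first machine learning model are received."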
Regarding claim 12; Claim 12 is an apparatus claim directly corresponding to claim 5, and is therefore rejected using the same reasoning. Regarding claim 13; Claim 13 is an apparatus claim directly corresponding to claim 6, and is therefore rejected using the same reasoning. Regarding claim 14; Claim 14 is an apparatus claim directly corresponding to claim 7, and is therefore rejected using the same reasoning. Regarding claim 15; Claim 15 is an apparatus claim directly corresponding to claim 9, and is therefore rejected using the same reasoning.

Regarding claim 16, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning as claim 2) wherein the second machine learning model includes a decision tree. ([col. 7 lines 40-49] The training instances 504 can be used to train and refine the machine-learning model 506. There are different types of inductive and transductive machine-learning models that can be used for the machine-learning model 506. Examples of machine-learning models include adsorption models, neural networks, support vector machines, radial basis functions, Bayesian belief networks, association-rule models, decision trees,) NOTE: Teaches that the machine learning model of Becker can be a decision tree. Becker fails to teach but Desai teaches; the second machine learning model (Using the same reasoning from claim 2) Using the same reasoning from claim 2, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, for the second machine learning model taught by Desai to be the model taught by Becker.

Regarding claim 17; Claim 17 is a computer readable medium claim corresponding to claim 2, and is therefore rejected using the same reasoning, with the additional limitation of; receive, from a second compute device that is remote from the first compute device, all parameters of a first machine learning model; Which is taught by Desai; ([col. 5 lines 63-67, col.
6 lines 1-3] A model parameter is a value associated with a characteristic of a model. For example, if the model is a neural network, the model parameters may be a set of weights corresponding to a set of neural network nodes in the model. As another example, if the model is a different type of algorithm, a set of model parameters may be set of constants, variables, or some combination thereof, used in one or more functions or computations in the algorithm.) NOTE: A machine learning model (such as the first machine learning model taught by Desai) has a set of parameters. (Application 105 implements an embodiment described herein. Application 105 trains base model 103 using public data 109 and delivers trained base model to nodes such as device 132 and client 114.) NOTE: The first machine learning model (and therefore all of its corresponding parameters) is received by the first compute device (device 132) from the second compute device (application 105) which is remote to the first compute device (as previously taught in claim 2).

Regarding claim 18; Claim 18 is a computer readable medium claim corresponding to claim 6, and is therefore rejected using the same reasoning, with the additional limitation of; the predefined type of data including at least one of a signature or a text entry. Which is taught by Becker; ([col. 9 lines 46-60] At block 618, the processors define a character space for the image segment based on the classification that was assigned by the one or more machine-learning models. In one example, if the classification indicates that the image segment is "box a" or "box b" from a W-2 form, the character space for the image segment can be defined as the digits 0-9 and the hyphen character. In another example, if the classification indicates that the image segment is "box l" of a W2 form, the character space for the image segment can be defined as the digits 0-9, the comma character, and the period character.
In another example, if the classification indicates that the image segment is a field for a middle initial, the character space for the image segment can be defined as all capital and lower-case letters and the period character.) NOTE: Teaches the predefined type of data including at least a text entry (a field for digits, a field for a middle initial, etc.)

Regarding claim 19; Claim 19 is a computer readable medium claim directly corresponding to claim 16, and is therefore rejected using the same reasoning.

Regarding claim 20, Becker in view of Desai teaches; The non-transitory processor-readable medium of claim 17, (Using the teaching from claim 17) Becker teaches; where each document from the set of documents includes at least one of printed text or handwritten text. ([col. 8 lines 48-58] At block 602, the processors identify an image of a form. The image may have been taken using a digital camera or a scanner. The form may be, for example, a tax form such as a W-2, a 1099-MISC, a 1098-T, or a 1040. The form may have been printed on paper before the image was taken. The image may be in a raster format such as Joint Photographic Experts Group (JPEG), Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), Bitmap (BMP), or Portable Network Graphics (PNG). Alternatively, the image may be in a vector format such as Computer Graphics Metafile (CGM) or Scalable Vector Graphics (SVG).) NOTE: Discloses the documents of the disclosure being images of forms (which may have been printed). A scanned form plainly teaches text that is at least typewritten/printed, and often also handwritten. Forms contain preprinted text such as labels, headers, and field names, which constitutes typewritten/printed text. And where the form includes filled-in entries, signatures, initials, or other user-added content, the form includes handwritten text. This therefore teaches the documents containing handwritten or printed text.
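As an illustrative aside, Becker's block-618 idea (once a segment is classified as a given field type, the OCR character space is restricted to what that field can contain) can be sketched as a lookup from field classification to permitted characters. The mapping below is a hypothetical illustration, not the patent's actual table:

```python
# Hedged sketch: map a field classification to its constrained character space,
# so recognition for that segment considers only legal characters.
import string

CHARACTER_SPACES = {
    "w2_box_a": set(string.digits) | {"-"},        # SSN field: digits and hyphen
    "w2_box_1": set(string.digits) | {",", "."},   # wages field: digits, comma, period
    "middle_initial": set(string.ascii_letters) | {"."},
}

def allowed_characters(field_classification):
    """Return the predefined character space for a classified field image."""
    return CHARACTER_SPACES[field_classification]
```

For example, `allowed_characters("w2_box_a")` admits digits and hyphens but excludes letters, matching the SSN example quoted from Becker.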
Regarding claim 21; Claim 21 is a computer readable medium claim directly corresponding to claim 3, and is therefore rejected using the same reasoning.

Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Becker (US 10621727 B1, 2020-04-14) in view of Desai (US 11164108 B2, 2021-11-02), further in view of Li, Jinyu et al. (hereinafter Li) (US 20160078339 A1, 2016-03-17).

Regarding claim 4, Becker in view of Desai teaches; The method of claim 2, further comprising: (Using the same reasoning from claim 2) Becker and Desai fail to teach but Li teaches; executing the first machine learning model and the second machine learning model. ([Abstract] Systems and methods are provided for generating a DNN classifier by “learning” a “student” DNN model from a larger more accurate “teacher” DNN model. The student DNN may be trained from un-labeled training data because its supervised signal is obtained by passing the un-labeled training data through the teacher DNN. In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models.) NOTE: Teaches a second model (student model) based on a first model (teacher model), and teaches executing the first and second model (output distributions of the teacher and student models indicates that they have been executed).

OBVIOUSNESS TO COMBINE LI WITH BECKER AND DESAI: Li is analogous art to Becker, Desai, and the present disclosure as they all pertain to methods of machine learning. Specifically, Li pertains to generating a DNN classifier by learning a student model from a more accurate teacher model. Additionally, Li states; ([Abstract] In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models.
For each iteration until convergence, the difference in the output distributions is used to update the student DNN model, and output distributions are determined again, using the unlabeled training data. The resulting trained student model may be suitable for providing accurate signal processing applications on devices having limited computational or storage resources such as mobile or wearable devices.) NOTE: This excerpt discloses that by executing both models to use their respective output distributions to minimize divergence between the two models, the generated student model is smaller and still performs accurately, which is more suitable for resource constrained devices. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to execute both the first and second model of claim 2 (taught by Becker in view of Desai) to use their outputs to minimize divergence between the models (as taught by Li), resulting in an accurate model suitable for resource constrained environments.

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Becker (US 10621727 B1, 2020-04-14) in view of Desai (US 11164108 B2, 2021-11-02), further in view of Selva, Bruno et al. (hereinafter Selva) (US 11176443 B1, 2021-11-16).

Regarding claim 8, Becker in view of Desai teach; The method of claim 2 (Using the same reasoning from claim 2) Becker fails to teach but Desai teaches; Second machine learning model (Using the same reasoning as claim 2) Using the reasoning from claim 2, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, for the second machine learning model taught by Desai to be the model taught by Becker.
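As an illustrative aside, the teacher-student procedure Li describes for claim 4 (executing both models on the same unlabeled input and updating the student to shrink the gap between the two output distributions) can be sketched with a toy one-parameter model standing in for a real DNN; all names are hypothetical:

```python
# Hedged sketch of teacher-student training: run both models, use the output
# difference to update the student, and repeat until the outputs converge.

def execute(model_weight, x):
    """Run a (toy) model: its 'output distribution' is just weight * x."""
    return model_weight * x

def distill(teacher_w, student_w, unlabeled, lr=0.1, steps=50):
    """Iteratively reduce the divergence between teacher and student outputs."""
    for _ in range(steps):
        for x in unlabeled:
            gap = execute(teacher_w, x) - execute(student_w, x)
            student_w += lr * gap * x  # gradient step on the squared divergence
    return student_w

# Student starts at 0.0 and is pulled toward the teacher's behavior.
student = distill(teacher_w=2.0, student_w=0.0, unlabeled=[0.5, 1.0, 1.5])
```

Note that both models are executed on every iteration, which is the sense in which Li teaches "executing the first machine learning model and the second machine learning model."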
Becker in view of Desai fail to teach but Selva teaches; wherein the second machine learning model is further generated based on information associated with the set of field images, the information associated with the set of field images including at least one of a field image creation date, a field image edit date, a field image format, a field image dimension, a field image file format, a field image length, a field image word count, or a field image character count. ([col. 9 lines 41-45] Segmentation at 422 is a common step in document digitization. It decomposes a document image into a large number of disjoint sub-images, such as the text delineated by rectangles as shown in FIG. 5, such that each sub-image contains at most a few words worth of text in a single line.) NOTE: Teaches the image segments being field images (the image segments of fig. 5 represent fields of an invoice such as the date, balance due, etc.) ([col. 8 lines 62-67, col. 9 lines 1-20] In image pre-processing 410, the image portion of the text segments 402 and 404 are scaled and centered 412. Conventional deep learning algorithms have a fixed input dimension, for example, a 512×64 field-of-view. On the other hand, segment images in the text segments 402, 404 can have a wide range of dimensions. Rather than requiring the deep learning system 103 to learn a wide range of sizes (i.e., train the neural net for “size invariance”), embodiments scale the input images to the largest size that fits in the field of view of the selected deep learning algorithm. For example, if a segment is 76×21 it is upscaled at 411 by a factor of 3 (using a conventional pixel replication algorithm in one embodiment) to produce a larger segment of size 228×63.
Furthermore, rather than pasting the 228×63 segment arbitrarily anywhere within the 512×64 field-of-view (i.e., train the neural net for “translation invariance”) it is centered at 412, by for example, pasting the 228×63 segment at the center of 512×64 palette, pre-initialized with white pixels. The pre-processing performed by modules 406 and 410, operate to restrict an arbitrary segment to format into a fixed normative input for the deep learning system 103, thereby avoiding the need for the system 103 to learn “size invariance” or “translation invariance”. This is in contrary to current approaches in training deep nets for computer vision problems and helps in improving accuracy of the model employed by system 100.) NOTE: Teaches that the machine learning model (a deep learning system is considered a machine learning model) is further generated (training/learning for a machine learning model is considered generating, because training/learning is the act that produces the operative trained version of the model) based on information associated with the set of field images (the aforementioned segment images), the information associated with the set of field images including at least a field image dimension (The image segments are scaled and centered based on their dimensions to produce normalized inputs for the deep learning system during training. From this, the training inputs from which the model is generated are formed based on the image dimensions).

OBVIOUSNESS TO COMBINE SELVA WITH BECKER AND DESAI: Selva is analogous art to Becker and Desai and the present disclosure as they all pertain to machine learning. Specifically, Selva pertains to an image text recognition system utilizing deep learning and image segmentation. Selva also states; ([col.
9 lines 13-20] The pre-processing performed by modules 406 and 410, operate to restrict an arbitrary segment to format into a fixed normative input for the deep learning system 103, thereby avoiding the need for the system 103 to learn “size invariance” or “translation invariance”. This is in contrary to current approaches in training deep nets for computer vision problems and helps in improving accuracy of the model employed by system 100.) NOTE: This excerpt discloses that the method of normalizing the input dimensions of the field images to generate the machine learning model taught by Selva avoids the need for the model to learn “size invariance” or “translation invariance”, thereby improving the model accuracy. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to generate the claimed second model (as taught by Becker in view of Desai) based on the dimensions of the field images (as taught by Selva) to improve the accuracy of the model.

CONCLUSION

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew Alan Cady, whose telephone number is (571) 272-7229. The examiner can normally be reached Monday - Friday, 7:30 am - 5:00 pm ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Cesar Paula, can be reached at (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MATTHEW ALAN CADY/
Examiner, Art Unit 2145

/CESAR B PAULA/
Supervisory Patent Examiner, Art Unit 2145
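The scale-and-center pre-processing the rejection relies on from Selva (upscale a segment by integer pixel replication to the largest size fitting a 512×64 field of view, then paste it centered on a white palette) can be sketched as follows. This is a minimal illustrative sketch, not code from the patent; the function name `scale_and_center` and the pure-Python pixel representation are assumptions made for clarity.

```python
# Illustrative sketch of Selva's scale-and-center pre-processing (col. 8-9):
# upscale a text-segment image by integer pixel replication to the largest
# size that fits the model's fixed field of view, then paste it centered on
# a palette pre-initialized with white pixels.

WHITE = 255  # background intensity for the pre-initialized palette

def scale_and_center(segment, fov_w=512, fov_h=64):
    """segment: 2-D list of grayscale pixels, indexed [row][col]."""
    h, w = len(segment), len(segment[0])
    # Largest integer replication factor that still fits the field of view.
    factor = min(fov_w // w, fov_h // h)
    # Pixel replication: each source pixel becomes a factor-by-factor block,
    # e.g. a 76-wide by 21-high segment scaled by 3 becomes 228x63.
    scaled = [[segment[r // factor][c // factor] for c in range(w * factor)]
              for r in range(h * factor)]
    # Paste the scaled segment at the center of a white palette, avoiding the
    # need for the downstream model to learn translation invariance.
    palette = [[WHITE] * fov_w for _ in range(fov_h)]
    top = (fov_h - len(scaled)) // 2
    left = (fov_w - len(scaled[0])) // 2
    for r, row in enumerate(scaled):
        for c, px in enumerate(row):
            palette[top + r][left + c] = px
    return palette
```

For the 76×21 example cited in the rejection, the factor is min(512 // 76, 64 // 21) = 3, yielding a 228×63 segment centered on the 512×64 palette, matching the dimensions Selva recites.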

Prosecution Timeline

Jul 25, 2023: Application Filed
Mar 27, 2026: Non-Final Rejection, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 3y 3m
PTA Risk: Low

Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
