DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 5, 12 and 19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
In claim 5 and analogous claims 12 and 19, the recited process/algorithm “structured clustering” is not defined by the claim and is not a term of art. The specification does not disclose the aforementioned algorithm in sufficient detail to demonstrate to one of ordinary skill in the art that the inventor possessed the invention. See MPEP 2161.01(I).
In light of the specification, Examiner suggests “structured clustering” should read as “supervised clustering.” (See para. [0014], “In some embodiments, supervised clustering may include providing names for the clusters corresponding to the primary or initial attribute, such as 0 wheels, 1 wheeled, 2 wheeled, 3 wheeled, etc.”)
Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 and analogous claims 8 and 15 recite “each bounding box indicating a location of one of a plurality of words arranged on a first document.” It is unclear if “one of a plurality of words” means “one word of a plurality of words” or “one plurality of words of a plurality of words.”
For examination purposes, Examiner interprets “one of a plurality of words” to mean “one word of a plurality of words.”
Claims 2-7, 9-14 and 16-20 are further rejected by virtue of their dependency on the base claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
In regards to claim 1,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A – Prong 1: Judicial Exception Recited?
MPEP 2106.04(a)(2)(I) “Accordingly, the "mental processes" abstract idea grouping is defined as concepts performed in the human mind, and examples of mental processes include observations, evaluations, judgments, and opinions.”
Further, the MPEP recites “The courts do not distinguish between mental processes that are performed entirely in the human mind and mental processes that require a human to use a physical aid (e.g., pen and paper or a slide rule) to perform the claim limitation.”
Yes, the claim recites a mental process, specifically:
detecting a plurality of bounding boxes, each bounding box indicating a location of one of a plurality of words arranged on a first document
This limitation encompasses an evaluation of a first document and providing a judgement of boxes around a plurality of words. For example, with the aid of a pen and the first document with words, one of ordinary skill in the art would be able to box words on the page.
identifying coordinates for each of the plurality of bounding boxes;
This limitation encompasses a further evaluation of the first document and providing an opinion of a coordinate system for each of the plurality of bounding boxes. For example, with the aid of a pen and the first annotated document, one of ordinary skill in the art would be able to draw a Cartesian coordinate system on the page and provide coordinates for each bounding box.
generating an adjacency matrix based on combining the key matrix and the query matrix
This limitation encompasses a further evaluation of the first annotated document and an observation of the outputs provided by the neural network to provide a judgement in the form of an adjacency matrix, which Examiner interprets to be a matrix indicating the relative positional relationships of the bounded words. Examiner interprets this claim in light of the specification and notes one of ordinary skill in the art would be able to provide this adjacency matrix with the aid of pen and paper, (“[0043] FIG. 2 illustrates an example adjacency matrix 226, according to some embodiments. Document 206 includes example text 108 that includes various bounding boxes 118 as identified by OCR 116. As illustrated, the corresponding adjacency matrix 226 (which may be an example of adjacency matrix 126) includes 6 vertical lines and 6 horizontal lines, corresponding to the 6 identified bounding boxes 118 or words of document 206.
[0044] The first line w1 may correspond to the first word “Hello” the second line w2 may correspond to the second word as indicated by the second bounding box “World!”, the third line w3 may correspond to the third word “This”, and so on. As can be seen in adjacency matrix 226, the intersections of w1 (Hello) and w2 (World) both include 1 values indicating they belong on the same line.”)
[Figure omitted: media_image1.png]
clustering the plurality of words into a plurality of clusters, wherein each cluster corresponds to a different line on the first document
This limitation encompasses an evaluation of the first document wherein a person of ordinary skill in the art would be able to cluster words by delimiting them at line breaks.
generating a second document, wherein each of the plurality of words corresponding to a respective cluster of the plurality of clusters is arranged on a same line on the second document
This limitation encompasses further evaluation wherein a person of ordinary skill in the art would be able to essentially replicate the first document, with the aid of pen and paper, based on the clusters of words with the same line breaks.
Therefore, the claim recites a mental process.
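For illustration only, the relationship between line membership, the adjacency matrix described in paragraphs [0043]-[0044], and the line-by-line second document can be sketched as follows. The words after "Hello", "World!" and "This", the line assignments, and all function names are hypothetical and are not taken from the specification.

```python
# Hypothetical sketch: words 4-6 and all line labels are assumed for illustration.
words = ["Hello", "World!", "This", "is", "a", "test"]
line_of = [0, 0, 1, 1, 1, 1]  # assumed line index of each word


def adjacency_from_lines(line_ids):
    """Return A where A[i][j] = 1 when words i and j share a line, else 0."""
    n = len(line_ids)
    return [[1 if line_ids[i] == line_ids[j] else 0 for j in range(n)]
            for i in range(n)]


def lines_from_adjacency(adj):
    """Group word indices into lines by reading the rows of the matrix."""
    seen, lines = set(), []
    for i, row in enumerate(adj):
        if i in seen:
            continue
        members = [j for j, value in enumerate(row) if value == 1]
        seen.update(members)
        lines.append(members)
    return lines


def render(all_words, lines):
    """Write each cluster of words on its own line of the second document."""
    return "\n".join(" ".join(all_words[j] for j in line) for line in lines)


adjacency = adjacency_from_lines(line_of)
second_document = render(words, lines_from_adjacency(adjacency))
print(second_document)
```

With these assumed inputs, the intersections of w1 and w2 hold 1 values, matching the behavior described in paragraph [0044].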
Step 2A – Prong 2: Integrated into a Practical Application?
Under MPEP 2106.05(f) (Mere Instructions To Apply An Exception), simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more. The following steps are mere instructions to apply:
generating, by a neural network, a key matrix and a query matrix based on the coordinates
(the BRI of the limitation encompasses executing a generic neural network on given coordinate data to provide outputs)
Under MPEP 2106.05(g) (Insignificant Extra-Solution Activity), post-solution activity is insignificant extra-solution activity. The following steps are insignificant extra-solution activities:
Post solution activity:
and providing the second document comprising the plurality of words arranged across a plurality of different lines, in accordance with the plurality of clusters, for display
(The BRI of this limitation encompasses displaying the evaluated data)
The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application. Therefore, no meaningful limits are imposed on practicing the abstract idea.
The claim is directed to the abstract idea.
Step 2B: Claim provides an Inventive Concept?
No. As discussed with respect to Step 2A, the additional limitations amount to mere data gathering/post-solution activity (Insignificant Extra-Solution Activity) and a generic device; they do not impose any meaningful limits on practicing the abstract idea, and therefore the claim does not provide an inventive concept in Step 2B.
The claim recites transmitting data by a generic device.
This has been determined to be insignificant extra-solution activity as found in MPEP § 2106.05(d)(II)(i): Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network); but see DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258, 113 USPQ2d 1097, 1106 (Fed. Cir. 2014) ("Unlike the claims in Ultramercial, the claims at issue here specify how interactions with the Internet are manipulated to yield a desired result--a result that overrides the routine and conventional sequence of events ordinarily triggered by the click of a hyperlink." (emphasis added)).
The additional elements have been considered both individually and as an ordered combination in the significantly more consideration.
The claim is ineligible.
In regards to claim 2,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
wherein the adjacency matrix includes a numerical value for each word.
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses providing a numerical value (opinion) for each word. See MPEP 2106.04(a)(2)(III)
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 3,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
wherein the numerical value is between zero and one.
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses providing a numerical value between zero and one (opinion) for each word. See MPEP 2106.04(a)(2)(III)
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 4,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
wherein the generating the adjacency matrix comprises multiplying the key matrix and the query matrix, wherein the adjacency matrix is a product of the multiplying.
This limitation directs to a mathematical calculation and encompasses matrix multiplication. See MPEP 2106.04(a)(2)(I)(C.)
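As a minimal sketch of the matrix multiplication this limitation encompasses, an adjacency matrix can be computed as the product of a query matrix and the transpose of a key matrix. The matrix values, the use of the transpose, and the names `transpose`, `matmul`, `query` and `key` are assumptions made for illustration; the claim only recites multiplying the two matrices.

```python
def transpose(m):
    """Swap the rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical 3-word example: one row per word, two features per word.
query = [[1, 0], [1, 0], [0, 1]]
key = [[1, 0], [1, 0], [0, 1]]

# Entry [i][j] is the dot product of query row i and key row j.
adjacency = matmul(query, transpose(key))
print(adjacency)
```

In this assumed example, words 1 and 2 share identical rows and therefore produce 1 values at their intersections, while word 3 is only connected to itself.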
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 5,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
wherein the clustering comprises structured clustering using names for the clusters as provided during a training of the neural network
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses providing an opinion of a name for each cluster during a training of the neural network (which can just be executing in the background under the claim’s BRI as the clustering is neither obtained from the neural network nor performed by the neural network.) For example, after evaluating the first document and obtaining 5 clusters, one of ordinary skill in the art would be able to name the 5 clusters as Bruce, Richard, Damian, Todd, and Tim. See MPEP 2106.04(a)(2)(III)
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 6,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
wherein the plurality of different lines on the second document correspond to a plurality of different lines on the first document
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses an evaluation of the first and second document to ensure the line breaks are the same. See MPEP 2106.04(a)(2)(III)
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 7,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – process.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in the parent claim(s).
identifying a first word and a second word, of the plurality of words, belonging to the same line based on the adjacency matrix
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses an evaluation of the adjacency matrix to provide opinions of a first and second word that belong to the same line. See MPEP 2106.04(a)(2)(III)
performing a vertical extension check on the first word with the second word, wherein the vertical extension check comprises determining whether there is a vertical intersection between the first word and the second word indicating that the first word and the second word are on different lines
This limitation directs to a mental process that can be performed in the human mind, by a human using pen and paper, or using a computer as a tool to perform the concept and encompasses an evaluation of the first and second word to provide a judgement on whether or not the first and second words are on different lines. Examiner interprets this claim in light of the specification and notes one of ordinary skill in the art is capable of evaluating the provided adjacency matrix for said vertical intersection, (“[0044] The first line w1 may correspond to the first word “Hello” the second line w2 may correspond to the second word as indicated by the second bounding box “World!”, the third line w3 may correspond to the third word “This”, and so on. As can be seen in adjacency matrix 226, the intersections of w1 (Hello) and w2 (World) both include 1 values indicating they belong on the same line.”) See MPEP 2106.04(a)(2)(III)
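One possible reading of the vertical extension check is sketched below: the first word's bounding box is extended vertically, and the second word is treated as being on a different line when its box falls within that vertical band without sharing the first word's vertical extent. This interpretation, the `Box` type, the sample coordinates, and the convention that y increases downward are all assumptions for illustration; they are not definitions taken from the claim or the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box; y is assumed to increase downward."""
    left: float
    top: float
    right: float
    bottom: float


def ranges_overlap(a_lo, a_hi, b_lo, b_hi):
    """True when the open intervals (a_lo, a_hi) and (b_lo, b_hi) overlap."""
    return a_lo < b_hi and b_lo < a_hi


def vertical_extension_intersects(first, second):
    """True when `second` sits above or below `first` in the same column band."""
    same_column_band = ranges_overlap(first.left, first.right,
                                      second.left, second.right)
    shares_vertical_extent = ranges_overlap(first.top, first.bottom,
                                            second.top, second.bottom)
    return same_column_band and not shares_vertical_extent


hello = Box(left=0, top=0, right=50, bottom=10)
world = Box(left=60, top=0, right=120, bottom=10)   # to the right of "Hello"
this = Box(left=0, top=20, right=40, bottom=30)     # below "Hello"

print(vertical_extension_intersects(hello, world))
print(vertical_extension_intersects(hello, this))
```

Under these assumptions, a word directly below another is flagged as being on a different line, while a word to its right is not.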
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in the parent claim(s).
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in the parent claim(s).
In regards to claim 8,
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. See MPEP 2106.03:
The claim directs to a statutory category – machine.
Step 2A Prong 1: The claim recites the following abstract ideas:
The abstract idea(s) in analogous claim 1.
Step 2A Prong 2: The claim recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application:
The additional element(s) in analogous claim 1.
A system comprising at least one processor, the at least one processor configured to perform operations
This limitation directs to merely applying (or equivalent) an abstract idea, or implementing an abstract idea on a computer, or using a computer as a tool to perform an abstract idea. See MPEP 2106.05(f)
Step 2B: The claim recites the following additional elements which, considered individually and as an ordered combination, do not amount to significantly more than the abstract idea:
The additional element(s) in analogous claim 1.
The remaining steps of claim 8 have been considered individually and do not amount to significantly more under the same rationale as the analogous steps of claim 1.
Additionally, the recited steps of claim 8 and analogous steps of claim 1 have been considered as an ordered combination to determine whether they integrate the exception into a practical application.
Thus, Examiner has determined these additional elements do not integrate the judicial exception into a practical application and do not amount to significantly more.
The claim is ineligible.
Claim 15 (machine) is rejected on the same grounds under 35 U.S.C. 101 as claim 8, as they are substantially similar, mutatis mutandis.
Claims 9 and 16 are rejected on the same grounds under 35 U.S.C. 101 as claim 2, as they are substantially similar, respectively, mutatis mutandis.
Claims 10 and 17 are rejected on the same grounds under 35 U.S.C. 101 as claim 3, as they are substantially similar, respectively, mutatis mutandis.
Claims 11 and 18 are rejected on the same grounds under 35 U.S.C. 101 as claim 4, as they are substantially similar, respectively, mutatis mutandis.
Claims 12 and 19 are rejected on the same grounds under 35 U.S.C. 101 as claim 5, as they are substantially similar, respectively, mutatis mutandis.
Claims 13 and 20 are rejected on the same grounds under 35 U.S.C. 101 as claim 6, as they are substantially similar, respectively, mutatis mutandis.
Claim 14 is rejected on the same grounds under 35 U.S.C. 101 as claim 7, as they are substantially similar, mutatis mutandis.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent No. US11625930B2 to Rodriguez et al. (“Rodriguez”) in view of CN Publication No. CN112668566A to Gao et al. (“Gao”).
In regards to claim 1,
Rodriguez teaches A method, comprising:
(Rodriguez, Col. 21 lines 6-18, “From the foregoing, it will be appreciated that example systems, methods, apparatus, and articles of manufacture have been disclosed that detect lines from OCR text. The disclosed systems, methods, apparatus, and articles of manufacture improve the efficiency of using a computing device by improving the accuracy of computer vision and reducing errors in line detection in media such as receipts with gaps in between words. The disclosed systems, methods, apparatus, and articles of manufacture are accordingly directed to one or more improvement(s) in the operation of a machine such as a computer or other electronic and/or mechanical device.”)
Rodriguez teaches detecting a plurality of bounding boxes, each bounding box indicating a location of one of a plurality of words arranged on a first document;
(Rodriguez, Col. 5 lines 52-56, “FIG. 6 is an example of computer syntax of example target feature data 600. For example, the OCR circuitry 410 of FIG. 4 may collect positional information of text boxes [detecting a plurality of bounding boxes, i.e., text boxes, each bounding box indicating a location, i.e., positional information, of one, i.e., one word, of a plurality of words arranged on a first document], which may be transmitted (e.g., communicated, sent) to the example data interface circuitry 402 of FIG. 4.”)
[Figure omitted: media_image2.png]
Rodriguez teaches identifying coordinates for each of the plurality of bounding boxes;
(Rodriguez, Col. 6 lines 55-64, “FIG. 7 is an example word 730 illustrating the different positional features selected by the example vertex representation circuitry 302. For example, the word 730 (e.g., “QUICK”) has a left center point 732. The left center point includes an x-coordinate of negative four (“−4”) and a y-coordinate of positive two (“2”). The right center point 736 includes an x-coordinate of positive five (“5”) and a y-coordinate of positive three (“3”). The example word includes a y-intersect point 734 at a y-coordinate of two point forty-four (“2.44”).”; see also figure 6)
[Figure omitted: media_image3.png]
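The coordinates quoted above from Rodriguez are numerically consistent with the y-intersect point being where the straight line through the left and right center points crosses the y-axis. That relationship is inferred here from the quoted numbers and is an assumption, not a statement made in Rodriguez; the helper `y_intercept` is likewise hypothetical.

```python
def y_intercept(left, right):
    """Return the y value at x = 0 of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = left, right
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1


# Center points quoted from Rodriguez, Col. 6 lines 55-64.
left_center = (-4.0, 2.0)
right_center = (5.0, 3.0)

value = y_intercept(left_center, right_center)
print(round(value, 2))  # 2.44
```

The slope is 1/9, so the line rises 4/9 between x = -4 and x = 0, giving 2 + 4/9 ≈ 2.44, which agrees with the quoted y-coordinate of point 734.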
Rodriguez teaches generating an adjacency matrix [based on combining the key matrix and the query matrix;]
(Rodriguez, Col. 7 lines 18-23, “The feature graph is used as an input for the graph neural network circuitry 304 which generates an adjacency matrix [generating an adjacency matrix; wherein the adjacency matrix is generated from a feature graph (which are extracted features in a graph format; see flow chart of fig. 13)] that represents the connections between words that belong to the same line. For example, words with an index i and index j, A[i,j]=1 if those words belong to the same line and A[i,j]=0 if otherwise.”)
[Figure omitted: media_image4.png]
Rodriguez teaches clustering the plurality of words into a plurality of clusters, wherein each cluster corresponds to a different line on the first document;
(Rodriguez, Col. 7 line 60-Col. 8 line 20, “FIG. 11 illustrates example operation/functionality of the example clique assembler circuitry 408 in line construction. The first word “THE” 1102 (in the black outlined box) is the reference word. The example clique assembler circuitry 408 determines there is a clique [clustering the plurality of words into a plurality of clusters, i.e., cliques] between the first word “THE” 1102 and the second word “QUICK” 1104 and the third word “FOX” 1106 (both in the dashed outlined boxes). The first word 1102, the second word 1104, and the third word 1106 are in a clique because there is a double connection between all three words. There is a first connection 1122 a from the first word “THE” 1102 to the second word “QUICK” 1104. There is a second connection 1122 b from the second work “QUICK” 1104 to the first word “THE” 1102. There is a second connection pair (1124 a and 1124 b) between the second word “QUICK” 1104 and the third word “FOX” 1106. There is a third connection pair (1120 a and 1120 b) between the first word 1102 and the third word 1106. As used herein, a double connection is wherein index i and index j, A[i,j]=1 if those words belong to the same line [wherein each cluster corresponds to a different line on the first document] and A[i,j]=0 if otherwise.
Despite the fourth word “JUMPED” 1108 (in the dotted outline box) having double connections with the first word 1102, the second word 1104, the third word 1106, the fourth word 1108 is not included in the first clique. The clique assembler circuitry 406 operates according to the rule that a word that is below one of the words in the clique (e.g., line) is unable to be added to the left or right of the clique (e.g., line).”)
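A simplified sketch of grouping words into lines from such an adjacency matrix follows. It greedily builds groups in which every pair of words is connected in both directions (the "double connection" described in the passage above). The greedy strategy, the function name `assemble_cliques`, and the sample matrix are assumptions for illustration; the sketch omits the additional rule about words positioned below a clique and is not Rodriguez's implementation.

```python
def assemble_cliques(adj):
    """Greedily group indices so every pair in a group is doubly connected."""
    unassigned = list(range(len(adj)))
    cliques = []
    while unassigned:
        clique = [unassigned.pop(0)]  # next reference word
        for j in list(unassigned):
            # Require A[i][j] = 1 and A[j][i] = 1 for every current member i.
            if all(adj[i][j] == 1 and adj[j][i] == 1 for i in clique):
                clique.append(j)
                unassigned.remove(j)
        cliques.append(clique)
    return cliques


words = ["THE", "QUICK", "FOX", "JUMPED"]

# Hypothetical matrix: JUMPED has only a one-way connection to THE,
# so it does not satisfy the double-connection requirement.
adj = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]

lines = [[words[i] for i in clique] for clique in assemble_cliques(adj)]
print(lines)
```

In this assumed example the first three words form one line and the fourth word forms its own line.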
Rodriguez teaches generating a second document, wherein each of the plurality of words corresponding to a respective cluster of the plurality of clusters is arranged on a same line on the second document;
(Rodriguez, Col. 9 lines 55-57, “In some examples, the line detection framework circuitry 400 includes means for outputting lines of text based on the cliques of OCR words.”; wherein the cliques are generated based on making sure a word is on the same line)
(Rodriguez, Col. 8 lines 16-20, “The clique assembler circuitry 406 operates according to the rule that a word that is below one of the words in the clique (e.g., line) is unable to be added to the left or right of the clique (e.g., line).”)
Rodriguez teaches and providing the second document comprising the plurality of words arranged across a plurality of different lines, in accordance with the plurality of clusters, for display.
(Rodriguez, Col. 16 lines 12-22, “One or more output devices 1924 are also connected to the interface circuitry 1920 of the illustrated example. The output devices 1924 can be implemented, for example, by display devices [providing the second document comprising the plurality of words arranged across a plurality of different lines, in accordance with the plurality of clusters, for display; Rodriguez teaches an interface for display] (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display (LCD), a cathode ray tube (CRT) display, an in-place switching (IPS) display, a touchscreen, etc.), a tactile output device, a printer, and/or speaker. The interface circuitry 1920 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip, and/or graphics processor circuitry such as a GPU.”)
However, Rodriguez does not explicitly teach generating, by a neural network, a key matrix and a query matrix based on the coordinates; [generating an adjacency matrix] based on combining the key matrix and the query matrix
Gao teaches generating, by a neural network, a key matrix and a query matrix based on the coordinates;
(Gao, Detailed Ways, “In the specific implementation, the electronic device can shoot the first table, obtaining the table image of the first table, and identifying the table image, mainly using OCR technology to obtain the identification result, the identification result can include text box content and content information, further; and performing feature extraction to the identification result to obtain the first feature vector, wherein the feature extraction feature can be at least one of the following: vertex coordinate of text box [based on the coordinates; wherein the coordinates are provided by Rodriguez], central coordinate, width, height, color, text character type statistics, text word vector, sentence vector, background color, font, texture and so on, which is not limited. The algorithm corresponding to the feature extraction can be at least one of the following: harris angular point detection, scale invariant feature transformation algorithm, neural network algorithm [by a neural network]; wavelet transform and so on, which is not limited, so as to represent each feature in the form of a vector, obtaining the first feature vector, namely the first feature vector can include the feature of the text box dimension of the table, also can include the feature of the content dimension of the table…
103. Splicing the first feature vector and the second feature vector to obtain a third feature vector.
In a specific implementation, the electronic device may process the dimensions of the first feature vector and the second feature vector into the same dimension in at least one dimension, and then splicing the two to obtain the third feature vector.
In specific implementations, eg. The electronic device may splicing the first feature vector and the second feature vector to obtain a third feature vector, such as feature splicing, to obtain an N×D feature vector E [generating… a key matrix and a query matrix], where N=Nt +Nl.”)
Gao teaches generating an adjacency matrix based on combining the key matrix and the query matrix
(Gao, Detailed Ways, “In the specific implementation, the electronic device can input the feature vector E to the model, the feature of the vertex represents X = (E), the shape of X is N * D. based on the vertex feature X. Further, it can obtain the relation adjacency matrix A [generating an adjacency matrix] = XWX^T by bilinear multiplication [combining the key matrix and the query matrix; wherein bilinear multiplication is matrix multiplication of matrices provided by the feature vector], the shape of W can be R * D * D, the shape of A can be R * N * N, wherein R is the relation number. The relationship predicted herein may include 3 (R= 3): two vertices are in the same table, the same row, the same column, which can respectively by A0, A1, A2. the target relation adjacency matrix Agt can be manually marked to input the vertex feature vector E to obtain the predicted relation adjacency matrix A. The invention further claims a method for obtaining the prediction relation adjacency matrix A.”)
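For illustration only, the bilinear multiplication quoted from Gao can be sketched as follows. This is a minimal sketch in plain Python and is not code from Gao or Rodriguez; the helper names (matmul, transpose, bilinear_adjacency) and the toy values for X and W are hypothetical. It assumes X has shape N * D and W has shape R * D * D, and it produces A of shape R * N * N, matching the shapes stated in the quotation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def bilinear_adjacency(X, W):
    """Return A where A[r] = X * W[r] * X^T for each relation r."""
    Xt = transpose(X)
    return [matmul(matmul(X, Wr), Xt) for Wr in W]


# Hypothetical toy values: N = 3 vertices, D = 2 features, R = 1 relation.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[[1.0, 0.0], [0.0, 1.0]]]  # identity weight, so A[0] reduces to X * X^T
A = bilinear_adjacency(X, W)
```

With an identity weight, each entry A[0][i][j] is the dot product of the feature vectors of vertices i and j, which is the sense in which the product combines two copies of the feature matrix.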
Rodriguez and Gao are both considered to be analogous to the claimed invention because they are in the same field of text recognition and OCR. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rodriguez to incorporate the teachings of Gao, namely a neural network algorithm and iterative optimization for predicting an adjacency matrix, in order to improve the precision of text recognition (Gao, Abstract, “By adopting the embodiment of the application, the form can be restored to the precision.”).
In regards to claim 2,
Rodriguez and Gao teach The method of claim 1,
Rodriguez teaches wherein the adjacency matrix includes a numerical value for each word.
(Rodriguez, Col. 7 lines 18-23, “The feature graph is used as an input for the graph neural network circuitry 304 which generates an adjacency matrix that represents the connections between words that belong to the same line. For example, words with an index i and index j, A[i,j]=1 if those words belong to the same line and A[i,j]=0 if otherwise.”)
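For illustration only, the binary adjacency matrix described in the quoted passage can be sketched as follows. This is a minimal sketch in plain Python and is not code from Rodriguez; the function name same_line_adjacency, the line_ids list, and the choice to set the diagonal entries to 1 are assumptions made for the example.

```python
def same_line_adjacency(line_ids):
    """Return A with A[i][j] = 1 if words i and j share a line, else 0.

    Assumption: a word is treated as being on the same line as itself,
    so the diagonal is 1. The quoted passage does not address the diagonal.
    """
    n = len(line_ids)
    return [[1 if line_ids[i] == line_ids[j] else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical line assignment for five words spread over two lines.
words = ["THE", "QUICK", "FOX", "JUMPED", "OVER"]
line_ids = [0, 0, 0, 1, 1]
A = same_line_adjacency(line_ids)
```

Each row of A holds one numerical value per word, and every value is either 0 or 1.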
In regards to claim 3,
Rodriguez and Gao teach The method of claim 2,
Rodriguez teaches wherein the numerical value is between zero and one.
(Rodriguez, Col. 7 lines 18-23, “The feature graph is used as an input for the graph neural network circuitry 304 which generates an adjacency matrix that represents the connections between words that belong to the same line. For example, words with an index i and index j, A[i,j]=1 if those words belong to the same line and A[i,j]=0 if otherwise.”)
In regards to claim 4,
Rodriguez and Gao teach The method of claim 1,
Gao teaches wherein the generating the adjacency matrix comprises multiplying the key matrix and the query matrix, wherein the adjacency matrix is a product of the multiplying.
(Gao, Detailed Ways, “In the specific implementation, the electronic device can input the feature vector E to the model, the feature of the vertex represents X = (E), the shape of X is N * D. based on the vertex feature X. Further, it can obtain the relation adjacency matrix A [generating an adjacency matrix] = XWXT by bilinear multiplication [multiplying the key matrix and the query matrix, wherein the adjacency matrix is a product of the multiplying; wherein bilinear multiplication is matrix multiplication of matrixes provided by the feature vector], the shape of W can be R * D * D, the shape of A can be R * N * N, wherein R is the relation number. The relationship predicted herein may include 3 (R= 3): two vertices are in the same table, the same row, the same column, which can respectively by A0, A1, A2. the target relation adjacency matrix Agt can be manually marked to input the vertex feature vector E to obtain the predicted relation adjacency matrix A. The invention further claims a method for obtaining the prediction relation adjacency matrix A.”)
In regards to claim 5,
Rodriguez and Gao teach The method of claim 1,
Rodriguez teaches wherein the clustering comprises structured clustering using names for the clusters as provided during a training of the neural network.
(Rodriguez, Col. 8 lines 13-20, “Despite the fourth word “JUMPED” 1108 (in the dotted outline box) having double connections with the first word 1102, the second word 1104, the third word 1106, the fourth word 1108 is not included in the first clique [clustering comprises structured clustering using names for the clusters as provided; i.e., “first clique” is the name of a cluster, but it can be any arbitrary name]. The clique assembler circuitry 406 operates according to the rule that a word that is below one of the words in the clique (e.g., line) is unable to be added to the left or right of the clique (e.g., line).”)
(Rodriguez, Col. 2 lines 39-45, “Unless specifically stated otherwise, descriptors such as “first,” “second,” “third,” etc., are used herein without imputing or otherwise indicating any meaning of priority, physical order, arrangement in a list, and/or ordering in any way, but are merely used as labels and/or arbitrary names to distinguish elements for ease of understanding the disclosed examples.”)
However, Rodriguez does not explicitly teach during a training of the neural network
Gao teaches during a training of the neural network
(Gao, Detailed Ways, “The prediction result A is compared with the labeled answer Agt, and the model parameters can also be optimized using gradient descent. Iterative optimization continues until the model training is complete.”; wherein, at each training iteration, the combined system of Rodriguez and Gao creates the adjacency matrix from which the cliques are generated, and Gao's iterative optimization repeats these steps until training is complete (thus, during a training of the neural network))
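For illustration only, the iterative optimization quoted from Gao can be sketched as follows. This is a minimal sketch in plain Python and is not code from Gao; Gao does not specify the loss, so a squared-error loss between the predicted matrix A = X W X^T and the labeled matrix A_gt is assumed, and the helper names, learning rate, iteration count, and toy values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def loss(X, W, A_gt):
    """Assumed loss: sum of squared differences between X W X^T and A_gt."""
    A = matmul(matmul(X, W), transpose(X))
    return sum((a - g) ** 2 for ra, rg in zip(A, A_gt) for a, g in zip(ra, rg))


def gradient_step(X, W, A_gt, lr):
    """One gradient descent update of W under the assumed squared-error loss."""
    Xt = transpose(X)
    A = matmul(matmul(X, W), Xt)
    residual = [[a - g for a, g in zip(ra, rg)] for ra, rg in zip(A, A_gt)]
    # The gradient of the squared error with respect to W is 2 * X^T (A - A_gt) X.
    grad = matmul(matmul(Xt, residual), X)
    return [[w - lr * 2.0 * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)]


# Hypothetical toy data: the labeled matrix equals X * X^T, so W should approach identity.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A_gt = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
W = [[0.0, 0.0], [0.0, 0.0]]

initial_loss = loss(X, W, A_gt)
for _ in range(200):  # repeat the predict-compare-update steps
    W = gradient_step(X, W, A_gt, lr=0.05)
final_loss = loss(X, W, A_gt)
```

Each pass through the loop recomputes the predicted adjacency matrix, compares it with the labeled matrix, and updates the parameters, which is the repetition of steps relied upon above.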
In regards to claim 6,
Rodriguez and Gao teach The method of claim 1,
Rodriguez teaches wherein the plurality of different lines on the second document correspond to a plurality of different lines on the first document.
(Rodriguez, Col. 4 lines 21-41, “FIG. 3 illustrates an example line detection framework 300 for decoding text (e.g., receipts, media with large gaps, media having distortions, etc.). The example line detection framework 300 includes at least three operations performed by circuitry: vertex feature representation circuitry 302, graph neural network circuitry 304, and post processing circuitry 306. Example input 308 for the vertex feature representation operation (e.g., step) (performed by the vertex feature representation circuitry 302) is OCRed text boxes. The vertex feature representation circuitry 302 uses the example OCRed text boxes 308 as an input, and produces an example feature graph 310 as an output. The example graph neural network circuitry 304 uses the example feature graph 310 as an input and generates an example adjacency matrix 312. The example post processing circuitry 306 uses the example adjacency matrix 312 as an input and generates text lines 314 having a corrected alignment with other text (e.g., words, corresponding numbers, etc.) that is relevant to a particular row [different lines on the second document correspond to a plurality of different lines on the first document; ie the row corresponds between the source document and the generated text lines] (e.g., a row on a receipt showing an item description, a corresponding item quantity, a corresponding item price, etc.)”)
In regards to claim 7,
Rodriguez and Gao teach The method of claim 1,
Rodriguez teaches further comprising: identifying a first word and a second word, of the plurality of words, belonging to the same line based on the adjacency matrix; and performing a vertical extension check on the first word with the second word, wherein the vertical extension check comprises determining whether there is a vertical intersection between the first word and the second word indicating that the first word and the second word are on different lines.
(Rodriguez, Col. 7 lines 24-55, “FIG. 9 is the example text 900 to be decoded that corresponds to the adjacency matrix 1000 of FIG. 10 . The example text in FIG. 9 is a sentence, having three lines, reading “THE QUICK FOX JUMPED OVER ANOTHER LAZY DOG.” The first line 902 includes the first three words (e.g., “THE”, “QUICK,” “FOX”), the second line 904 includes the next two words (e.g., “JUMPED”, “OVER”), and the third line 906 includes the last three words (e.g., “ANOTHER,” “LAZY,” “DOG.”). The example text 900 may be the output of the example output circuitry 414, and is merely shown to illustrate the functionality of the adjacency matrix 1000 of FIG. 10 .
FIG. 10 is an example adjacency matrix 1000. The example graph neural network circuitry 304 (of FIG. 4 ) utilizes the adjacency matrix generation circuitry 404 (of FIG. 4 ) to generate the adjacency matrix 1000. The example adjacency matrix lists each word of the text 900 against all the other words of the text. For example, the word “THE” 1004 is adjacent to the word “QUICK” 1006 as shown by the one (“1”) [identifying a first word and a second word, of the plurality of words, belonging to the same line based on the adjacency matrix]. The word “THE” 1004 is also adjacent to the word “FOX” 1008 even though there is not a direct connection as shown by the one (“1”). The word “THE” 1004 is adjacent to the word “JUMPED” 1010, despite not being in the same horizontal line as shown by the one (“1”). The example post-processing circuitry 306 of FIG. 3 will address the issue that mere adjacency is not the same as being in the same line. In some examples, the word “THE” 1004 is not adjacent to the word “JUMPED” 1010 because the word “JUMPED” 1010 is not in the same horizontal line, and is shown by a zero (“1”). The word “THE” 1004 is not adjacent to the word “ANOTHER” 1012 as evidenced by the zero (“0”) [performing a vertical extension check ie checking for either 0 or 1 on the first word with the second word, wherein the vertical extension check comprises determining whether there is a vertical intersection between the first word and the second word indicating that the first word and the second word are on different lines].”)
Claims 8 and 15 are rejected on the same grounds under 35 U.S.C. 103 as claim 1.
Claims 9 and 16 are rejected on the same grounds under 35 U.S.C. 103 as claim 2.
Claims 10 and 17 are rejected on the same grounds under 35 U.S.C. 103 as claim 3.
Claims 11 and 18 are rejected on the same grounds under 35 U.S.C. 103 as claim 4.
Claims 12 and 19 are rejected on the same grounds under 35 U.S.C. 103 as claim 5.
Claims 13 and 20 are rejected on the same grounds under 35 U.S.C. 103 as claim 6.
Claim 14 is rejected on the same grounds under 35 U.S.C. 103 as claim 7.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US Pub No. US20210034856A1 Torres et al. (“Torres”) teaches Region proposal networks for automated bounding box detection and text segmentation
US Pub No. US20080247674A1 Walch teaches Systems and methods for source language word pattern matching
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASMINE THAI whose telephone number is (703)756-5904. The examiner can normally be reached M-F 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.T.T./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129