DETAILED ACTION
This Non-Final Office Action replaces the Non-Final Office Action mailed on 12/03/2025.
Reopening of Prosecution
37 C.F.R. 1.198 Reopening after a final decision of the Patent Trial and Appeal Board:
When a decision by the Patent Trial and Appeal Board on appeal has become final for judicial review, prosecution of the proceeding before the primary examiner will not be reopened or reconsidered by the primary examiner except under the provisions of § 1.114 or § 41.50 of this title without the written authority of the Director, and then only for the consideration of matters not already adjudicated, sufficient cause being shown.
By signing below the Director authorizes reopening prosecution for consideration of the following matters not already adjudicated by the Board.
Prosecution of the Instant Application is hereby reopened in accordance with MPEP 1214.07, for matters that have not already been adjudicated, given sufficient cause being shown below.
This action is made Non-Final.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1 – 4, 7 – 12, 18, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by ZOU et al (US 2020/0065563).
As to claim 1, ZOU et al teaches a processor system, comprising:
one or more circuits to adjust a number of inputs and/or a number of outputs of one or more nodes corresponding to one or more portions of a generated neural network portion based, at least in part, on whether the one or more nodes are likely to generate a valid result (paragraph [0049]… neurons, or artificial neurons, are elementary units in an artificial neural network. An artificial neuron receives one or more inputs and sums them to produce an output. This change advantageously speeds up the vector matching while still having a sufficiently high accuracy for facial recognition tasks ; paragraph [0066]… Feature vectors are generated for the images after training. For example, in the context of a login or authentication type application, after training, for a given face image of a user who wants to login, a feature vector is generated by the convolutional network. The generated vector then is used as an input of the multi-task learning network, with the outputs being the results of all classifiers. In this example, the output is the gender and age group of the face image ; paragraph [0067]… consider that an N-way tensor T with shape (D.sub.1, D.sub.2 . . . D.sub.N) is an N-dimensional array containing Π.sub.n=1.sup.ND.sub.n elements. For example scalars, vectors, and matrices can be seen as 0, 1 and 2-way tensors. [0068] 1. The inputs are k N-way tensors with N>0 or, for example, two matrices with same shape (m, a, b). [0069] 2. Fix the first dimension and reshape the inputs to 2-way tensors. Examiner’s Note: “reshape inputs” reads on “adjust a number of inputs” (see paragraphs 31 and 68-74 of the instant application, where a reshape operation is used to adjust the number of inputs/outputs); “results of all classifiers” reads on “generate a valid result”).
As to claim 2, ZOU et al shows and teaches the processor, wherein the one or more circuits are to generate one or more linear equations from a matrix of values that includes one or more parameters (paragraph [0067]… The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “linear transformation” reads on “linear equations”).
As to claim 3, ZOU et al shows and teaches the processor, wherein the one or more circuits are to receive a matrix that is generated based on defined constraints for:
a size of the matrix, and
a range for one or more of a matrix of values.
(paragraph [0067]… The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “shape (m, p)” reads on “size of the matrix” ; “matrix X′=XA” reads on “a range for one or more of a matrix of values” ; By definition, the range of a matrix is the set of all possible output vectors (linear combinations) obtained by multiplying the matrix by any possible input vector).
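The definition of the range of a matrix given in the Examiner’s Note can be illustrated with a small numerical example (the matrix and input values below are hypothetical, chosen only for illustration; this sketch is not part of the cited reference):

```python
import numpy as np

# Hypothetical rank-1 matrix: every product x @ A has the form [x1, 0],
# so the range (the set of all possible outputs) is a one-dimensional line.
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
x = np.array([3.0, 7.0])   # an arbitrary input vector
y = x @ A                  # y = [3.0, 0.0], an element of the range of A
```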
As to claim 4, ZOU et al shows and teaches the processor, wherein the one or more circuits are to receive a matrix that defines a number of anchor nodes for the one or more portions of the generated neural network portion and connections between the anchor nodes (paragraph [0053]… Dropout is a regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting. Using dropout, certain units (neurons) in a layer are randomly deactivated with a certain probability ; paragraph [0064]… The multi-task learning classifier network 610 shows two classifiers, i.e., in the upper and lower rows. More particularly, one sub-network is built for each classifier, and each sub-network includes a four-layer fully-connected neural network in this example. Each classifier has three dense hidden layers 612a/612b/612c, 614a/614b/614c, and each hidden layer has 32 neurons. Dropout 616 is applied on the latter two hidden layers of each sub-network. The number of neurons and number of layers to which dropout may be applied are hyperparameters and, thus, may be different in other example embodiments. The last layer is the output layer 618a-618b, with the classifications (which in this example are female/male, and young/old). Examiner’s Note: The anchor nodes are the nodes left remaining after dropout occurs and deactivates the non-anchor nodes).
As to claim 7, ZOU et al shows and teaches the processor, wherein the one or more circuits are to randomly generate the neural network portion (paragraph [0053]… Dropout is a regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting. Using dropout, certain units (neurons) in a layer are randomly deactivated with a certain probability).
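The dropout operation quoted from ZOU’s paragraph [0053] can be illustrated with the following sketch (values are hypothetical; the inverted-dropout scaling shown is a common convention assumed here, not quoted from the reference):

```python
import numpy as np

# Units in a layer are randomly deactivated with a certain probability,
# per ZOU paragraph [0053]. Shapes and probability are illustrative only.
rng = np.random.default_rng(0)
drop_prob = 0.5
activations = np.ones((1, 8))                       # example layer output
mask = rng.random(activations.shape) >= drop_prob   # keep each unit with prob 1 - drop_prob
dropped = activations * mask / (1.0 - drop_prob)    # surviving units rescaled (inverted dropout)
```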
As to claim 8, ZOU et al shows and teaches the processor, wherein the one or more circuits are to solve a system of linear equations to indicate valid dimensions for the one or more nodes within the generated neural network portion (paragraph [0067]… The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “linear transformation” reads on “linear equations” ; “shape (m, p)” reads on “valid dimensions”).
As to claim 9, ZOU et al shows and teaches the processor, wherein the one or more circuits are to calculate, using the matrix as input, a number of rows and a number of columns of the matrix (paragraph [0067]… the inputs are k N-way tensors with N>0 or, for example, two matrices with same shape (m, a, b). [0069] 2. Fix the first dimension and reshape the inputs to 2-way tensors. With the example before, each matrix will have shape (m, a×b). [0070] 3. Concatenate these 2-way tensors along the second axis to get a matrix X with shape (m, p). With the example before, p=a×b+a×b. [0071] 4. The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “shape (m, p)” reads on “number of rows and number of columns” ; By definition, a matrix is fundamentally a rectangular array of numbers (or other mathematical objects) organized into horizontal rows and vertical columns).
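The cross-stitch procedure quoted above (ZOU paragraphs [0068]-[0073]) can be illustrated with the following sketch (shapes and values are hypothetical, chosen only for illustration; this sketch is not part of the cited reference):

```python
import numpy as np

# Step 1: k input tensors; here two 3-way tensors with the same shape (m, a, b)
k = 2
m, a, b = 4, 3, 5
inputs = [np.random.rand(m, a, b) for _ in range(k)]

# Step 2: fix the first dimension and reshape each input to a 2-way tensor (m, a*b)
flat = [t.reshape(m, a * b) for t in inputs]

# Step 3: concatenate along the second axis -> matrix X with shape (m, p), p = a*b + a*b
X = np.concatenate(flat, axis=1)
p = X.shape[1]

# Step 4: cross-stitch parameter matrix A with shape (p, p), initialized to the identity
A = np.eye(p)

# Step 5: apply the linear transformation X' = XA, shape (m, p)
X_prime = X @ A

# Step 6: reshape X' and split it into k tensors with the same shapes as the inputs
outputs = [part.reshape(m, a, b) for part in np.split(X_prime, k, axis=1)]
```

With A initialized to the identity, the outputs equal the inputs; training then adjusts A away from the identity.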
As to claim 10, ZOU et al teaches a non-transitory computer readable medium having stored thereon a set of instructions, which if performed by one or more processors (paragraph [0091]… at least one processor executes instructions that may be tangibly stored on a non-transitory computer readable storage medium), cause the one or more processors to:
adjust a number of inputs and/or a number of outputs of one or more nodes corresponding to one or more portions of a generated neural network portion based, at least in part, on whether the one or more nodes are likely to generate a valid result (paragraph [0049]… neurons, or artificial neurons, are elementary units in an artificial neural network. An artificial neuron receives one or more inputs and sums them to produce an output. This change advantageously speeds up the vector matching while still having a sufficiently high accuracy for facial recognition tasks ; paragraph [0066]… Feature vectors are generated for the images after training. For example, in the context of a login or authentication type application, after training, for a given face image of a user who wants to login, a feature vector is generated by the convolutional network. The generated vector then is used as an input of the multi-task learning network, with the outputs being the results of all classifiers. In this example, the output is the gender and age group of the face image ; paragraph [0067]… consider that an N-way tensor T with shape (D.sub.1, D.sub.2 . . . D.sub.N) is an N-dimensional array containing Π.sub.n=1.sup.ND.sub.n elements. For example scalars, vectors, and matrices can be seen as 0, 1 and 2-way tensors. [0068] 1. The inputs are k N-way tensors with N>0 or, for example, two matrices with same shape (m, a, b). [0069] 2. Fix the first dimension and reshape the inputs to 2-way tensors. Examiner’s Note: “reshape inputs” reads on “adjust a number of inputs” ; “results of all classifiers” reads on “generate a valid result”).
As to claim 11, ZOU et al shows and teaches the non-transitory computer readable medium, wherein the set of instructions which if performed by the one or more processors, cause the one or more processors to receive as input a matrix of values including one or more parameters, wherein the matrix is generated based on defined constraints for:
a size of the matrix, and
a range for one or more of a matrix of values.
(paragraph [0067]… The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “shape (m, p)” reads on “size of the matrix” ; “matrix X′=XA” reads on “a range for one or more of a matrix of values” ; By definition, the range of a matrix is the set of all possible output vectors (linear combinations) obtained by multiplying the matrix by any possible input vector).
As to claim 12, ZOU et al shows and teaches the non-transitory computer readable medium, wherein the set of instructions which if performed by the one or more processors, cause the one or more processors to receive as input a matrix of values including one or more parameters, wherein the matrix defines a number of anchor nodes for the one or more portions of the generated neural network portion and connections between the anchor nodes (paragraph [0053]… Dropout is a regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting. Using dropout, certain units (neurons) in a layer are randomly deactivated with a certain probability ; paragraph [0064]… The multi-task learning classifier network 610 shows two classifiers, i.e., in the upper and lower rows. More particularly, one sub-network is built for each classifier, and each sub-network includes a four-layer fully-connected neural network in this example. Each classifier has three dense hidden layers 612a/612b/612c, 614a/614b/614c, and each hidden layer has 32 neurons. Dropout 616 is applied on the latter two hidden layers of each sub-network. The number of neurons and number of layers to which dropout may be applied are hyperparameters and, thus, may be different in other example embodiments. The last layer is the output layer 618a-618b, with the classifications (which in this example are female/male, and young/old). Examiner’s Note: The anchor nodes are the nodes left remaining after dropout occurs and deactivates the non-anchor nodes).
As to claim 18, ZOU et al teaches a method comprising:
adjusting a number of inputs and/or a number of outputs of one or more nodes corresponding to one or more portions of a generated neural network portion based, at least in part, on whether the one or more nodes are likely to generate a valid result (paragraph [0049]… neurons, or artificial neurons, are elementary units in an artificial neural network. An artificial neuron receives one or more inputs and sums them to produce an output. This change advantageously speeds up the vector matching while still having a sufficiently high accuracy for facial recognition tasks ; paragraph [0066]… Feature vectors are generated for the images after training. For example, in the context of a login or authentication type application, after training, for a given face image of a user who wants to login, a feature vector is generated by the convolutional network. The generated vector then is used as an input of the multi-task learning network, with the outputs being the results of all classifiers. In this example, the output is the gender and age group of the face image ; paragraph [0067]… consider that an N-way tensor T with shape (D.sub.1, D.sub.2 . . . D.sub.N) is an N-dimensional array containing Π.sub.n=1.sup.ND.sub.n elements. For example scalars, vectors, and matrices can be seen as 0, 1 and 2-way tensors. [0068] 1. The inputs are k N-way tensors with N>0 or, for example, two matrices with same shape (m, a, b). [0069] 2. Fix the first dimension and reshape the inputs to 2-way tensors. Examiner’s Note: “reshape inputs” reads on “adjust a number of inputs” ; “results of all classifiers” reads on “generate a valid result”).
As to claim 20, ZOU et al shows and teaches the method, further comprising:
receiving a matrix as input;
calculating the number of rows and the number of columns of the matrix that correspond to dimensions of inputs and outputs of the one or more nodes in the generated neural network portion.
(paragraph [0067]… the inputs are k N-way tensors with N>0 or, for example, two matrices with same shape (m, a, b). [0069] 2. Fix the first dimension and reshape the inputs to 2-way tensors. With the example before, each matrix will have shape (m, a×b). [0070] 3. Concatenate these 2-way tensors along the second axis to get a matrix X with shape (m, p). With the example before, p=a×b+a×b. [0071] 4. The parameters of the cross-stitch layer will be represented with a matrix A with shape (p, p). It is initialized with an identity matrix at the beginning of training. [0072] 5. Multiply the matrix X and A to apply linear transformation on inputs. The result is a matrix X′=XA with shape (m, p). [0073] 6. Reshape X′ and split it into k tensors with the same shapes as inputs and output them ; Examiner’s Note: “inputs are k N-way tensors” reads on “matrix as input” ; “shape (m, p)” reads on “number of rows and number of columns” ; By definition, a matrix is fundamentally a rectangular array of numbers (or other mathematical objects) organized into horizontal rows and vertical columns).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over ZOU et al (US 2020/0065563) in view of Hung et al (US).
As to claim 4, ZOU et al teaches the one or more portions of the generated neural network portion is generated using one or more linear equations.
ZOU et al fails to explicitly show/teach determining a volume of each anchor node of a plurality of anchor nodes in the one or more portions of the generated neural network portion, and determining dimensions for each of the anchor nodes, based on the volume of the anchor node.
However, Hung et al teaches determining a volume of each anchor node of a plurality of anchor nodes in the one or more portions of the generated neural network portion, and determining dimensions for each of the anchor nodes, based on the volume of the anchor node (paragraph [0043]… Depth of the output volume controls the number of neurons in the layer that connect to the same region of the input volume. All of these neurons will learn to activate for different features in the input. For example, if the first convolutional layer takes the raw image as input, then different neurons along the depth dimension may activate in the presence of various oriented edges, or blobs of color. paragraph [0044]… Stride controls how depth columns around the spatial dimensions (width and height) are allocated. When the stride is 1, a new depth column of neurons is allocated to spatial positions only one spatial unit apart. This leads to heavily overlapping receptive fields between the columns, and also to large output volumes. Conversely, if higher strides are used then the receptive fields will overlap less and the resulting output volume will have smaller dimensions spatially. paragraph [0045]… Sometimes it is convenient to pad the input with zeros on the border of the input volume. The size of this zero-padding is another hyper-parameter. Zero padding provides control of the output volume spatial size. In particular, sometimes it is desirable to exactly preserve the spatial size of the input volume ; Examiner’s Note: “the spatial size of the output volume can be computed as a function of the input volume size W” reads on “determining a volume of each anchor node” ; “how many neurons fit in a given volume” reads on “anchor node” ; “Stride controls how depth columns around the spatial dimensions (width and height) are allocated” reads on “determining dimensions for each of the anchor nodes”).
Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the claimed invention, to modify ZOU et al to determine a volume of each anchor node of a plurality of anchor nodes in the one or more portions of the generated neural network portion, and to determine dimensions for each of the anchor nodes based on the volume of the anchor node, as in Hung et al, for the purpose of increasing the efficiency of CNNs, which require large amounts of computation.
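The relationship among input size, filter size, stride, and zero-padding described in Hung’s paragraphs [0043]-[0045] can be sketched with the standard convolutional output-size formula (the formula is the well-known CNN relationship, assumed here for illustration rather than quoted from Hung):

```python
def conv_output_size(W: int, F: int, S: int, P: int) -> int:
    """Spatial size of a convolutional output volume given input size W,
    filter size F, stride S, and zero-padding P: (W - F + 2P) / S + 1."""
    return (W - F + 2 * P) // S + 1

# Zero-padding can exactly preserve the spatial size of the input volume,
# as noted in paragraph [0045]: a 32-wide input with a 5-wide filter,
# stride 1, and padding 2 yields a 32-wide output.
same_size = conv_output_size(32, 5, 1, 2)   # -> 32

# Higher strides shrink the output volume spatially, per paragraph [0044].
strided = conv_output_size(7, 3, 2, 0)      # -> 3
```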
Claims 15 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over ZOU et al (US 2020/0065563) in view of Aggarwal et al (US).
As to claim 15, ZOU et al teaches the non-transitory computer readable medium, wherein the set of instructions which if performed by the one or more processors, cause the one or more processors to randomly generate the neural network portion (paragraph [0053]… Dropout is a regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting. Using dropout, certain units (neurons) in a layer are randomly deactivated with a certain probability).
ZOU et al fails to explicitly show/teach solving a set of linear equations to randomly generate the neural network portion.
However, Aggarwal et al teaches solving a set of linear equations to randomly generate the neural network portion (paragraph [0046]… For linear regression with regularization, the optimal ridge coefficient λ was selected by varying it between 1 and 1000 and selecting the parameter which gave the least RMS error in cross-validation. For Random Forests, n estimators between 1 and 100 were varied and maximum depth parameter was between 1 and 10. For support vector machines two kernels were tested: linear and radial basis function. In order to select the optimal SVM [Support Vector Machine] model, the penalty factor C, parameters γ and ε, the SVM kernel and the selected set of values were varied that gave the lowest RMS error in cross-validation. The Neural Networks model had one hidden layer and 5 to 10 neurons ; Examiner’s Note: By definition, linear regression is inherently random because it models a relationship with an unpredictable error term).
Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the claimed invention, to modify ZOU et al to solve a set of linear equations to randomly generate the neural network portion, as in Aggarwal et al, for the purpose of simplifying and understanding variable relationships.
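The mapping of “linear regression with regularization” (Aggarwal paragraph [0046]) onto “solving a set of linear equations” can be illustrated by the closed-form ridge solution, which solves the linear system (XᵀX + λI)w = Xᵀy (the data values and the ridge coefficient below are hypothetical; this sketch is not part of the cited reference):

```python
import numpy as np

# Hypothetical design matrix X and targets y for a ridge regression fit.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))
y = rng.standard_normal(20)
lam = 10.0   # ridge coefficient, the regularization parameter varied in paragraph [0046]

# Ridge regression reduces to solving the set of linear equations
# (X^T X + lam * I) w = X^T y for the weight vector w.
w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```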
As to claim 19, ZOU et al teaches the method, wherein the one or more portions of the generated neural network portion is generated by:
randomly generating the one or more portions of the generated neural network portion, wherein solutions indicate the one or more nodes that are valid (paragraph [0053]… Dropout is a regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting. Using dropout, certain units (neurons) in a layer are randomly deactivated with a certain probability ; Examiner’s Note: “regularization technique, which aims to reduce the complexity of the model with the goal to prevent overfitting” reads on “solutions indicate the one or more nodes that are valid”).
ZOU et al fails to explicitly show/teach solving a set of linear equations to randomly generate the neural network portion. However, Aggarwal et al teaches solving a set of linear equations to randomly generate the neural network portion (paragraph [0046]… For linear regression with regularization, the optimal ridge coefficient λ was selected by varying it between 1 and 1000 and selecting the parameter which gave the least RMS error in cross-validation. For Random Forests, n estimators between 1 and 100 were varied and maximum depth parameter was between 1 and 10. For support vector machines two kernels were tested: linear and radial basis function. In order to select the optimal SVM [Support Vector Machine] model, the penalty factor C, parameters γ and ε, the SVM kernel and the selected set of values were varied that gave the lowest RMS error in cross-validation. The Neural Networks model had one hidden layer and 5 to 10 neurons ; Examiner’s Note: By definition, linear regression is inherently random because it models a relationship with an unpredictable error term).
Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the claimed invention, to modify ZOU et al to solve a set of linear equations to randomly generate the neural network portion, as in Aggarwal et al, for the purpose of simplifying and understanding variable relationships.
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over ZOU et al (US 2020/0065563) in view of Cheng et al (US 2017/0140283).
As to claim 16, ZOU et al teaches the one or more portions of the generated neural network portion.
ZOU et al fails to show/teach determining a function for each connection of a plurality of connections between anchor nodes, based on dimensions of the anchor nodes linked by the connection.
However, Cheng et al teaches determining a function for each connection of a plurality of connections between anchor nodes, based on dimensions of the anchor nodes linked by the connection (paragraph [0050]… Training parameters, for example, can control the process of applying the training data to train the lookalike model, such as: a starting_weights parameter that can control starting weights for edges between nodes of the neural network functions, parameters to control how much weights and function parameters are adjusted for each training data sample, or parameters controlling a support vector machine such as a kernel to use, kernel parameters, dimensions parameters, or soft margin parameters ; Examiner’s Note: “starting weights for edges between nodes of the neural network functions” reads on “a function for each connection”).
Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the claimed invention, to modify ZOU et al to determine a function for each connection of a plurality of connections between anchor nodes, based on dimensions of the anchor nodes linked by the connection, as in Cheng et al, for the purpose of decreasing processing requirements.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 – 5, 7 – 13, and 15 - 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Allowable Subject Matter
Claims 6, 14, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRANDON S COLE whose telephone number is (571) 270-5075. The examiner can normally be reached Mon - Fri 7:30am - 5pm EST (alternate Fridays off).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRANDON S COLE/ Primary Examiner, Art Unit 2128
/DAVID A WILEY/ Director, Art Unit 2100