Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments filed 01/13/2026 have been fully considered but they are not persuasive.
Applicant’s Argument: On pages 12-14 of Applicant’s response to rejections under 35 U.S.C. 101, applicant states “Applicant submits that the claims are eligible under Step 2A, Prong 2 because various features of the claims, in fact, integrate any alleged abstract idea into a practical application— namely, the reduction in computational resources via early exiting, thereby enabling inferencing operations on complex data using a machine learning model on devices with limited computational resources. In particular, Applicant submits that this practical application reflects an improvement to other technology or a technical field, namely the field of machine learning and, more specifically, the technology of practically implementing machine learning models on physical hardware devices.
Likewise, here, the features of the present claims reflect an improvement to a technology or technical field. Specifically, like the claims found eligible in Ex parte Chari, the present claims recite specific techniques that provide a specific improvement to machine learning and practical implementation of machine learning models on physical devices. For example, the practical application improves the technical field of machine learning by using gating logic (e.g., a “first gate” as claimed) that “may be trained to allow such models to automatically determine the earliest point in processing where an inference is sufficiently reliable, and to then bypass additional processing.” Specification § [0023]. In doing so, “‘easier’ (e.g., less complex) input data are handled using earlier and thus fewer classifiers, and ‘harder’ (e.g., more complex) input data are handled using later and thus more classifiers.” By bypassing additional processing, computational resources are saved without sacrificing much accuracy, allowing these models to be deployed on lower power devices. Thus, the solutions discussed herein and reflected in the claims improve upon conventional techniques in which traditional model architectures “rely on the entire model to feed a single output layer,” which generally require “larger, computationally-intensive models” running on high power devices to work with complex data.”
Examiner’s Response: Applicant’s argument is not persuasive. The claimed improvement as recited in the claims is directed to an abstract idea, namely a mental process that can be performed in the human mind. The claim as a whole makes a determination, based on the complexity of the input data, whether to continue the classification operation or to exit it, and that determination can be performed entirely as a mental process.
Under the Step 2A Prong 2 analysis, the examiner determines whether additional elements integrate the abstract idea into a practical application and evaluates whether the steps which achieve the improvement are recited in the claim. The claim recites the following inventive concept: “making a determination by a first gate, based on a complexity of the first intermediate activation data, whether or not to exit processing by the classification model”. The first gate makes the determination whether to exit the computation based on a complexity factor. The claims as a whole are recited very broadly, and there are few descriptive claim elements that further limit the scope of the invention. For example, the term “complexity” is not well defined in the specification, and the claims do not further define how the term is to be interpreted. Nor does the claim disclose what determination is made based on the complexity of the data: it is not clear whether the system exits processing when the data is complex or when the data is not complex, and it is likewise unclear under which condition the system continues processing.
An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome (see MPEP 2106.05(a)). The amended claims do not provide sufficient details to describe any technological improvement. If the specification sets forth an improvement only in a conclusory manner (see MPEP 2106.04(d)(1): a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine that the claim improves technology.
During examination, the examiner should analyze the "improvements" consideration by evaluating the specification and the claims to ensure that a technical explanation of the asserted improvement is present in the specification, and that the claim reflects the asserted improvement (see MPEP §2106.05(a)). The MPEP (§2106.05(a)(II)) also warns, “it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology.” Here, the alleged improvement, in the form of a “technique for a model to early exit when the gate determines that additional processing is not necessary based on complexity of certain features,” is an improvement to the abstract idea of a mental process that can be performed in the human mind.
Regarding Ex parte Chari, applicant’s argument is not persuasive because the argument does not explain how the claims or the improvement in Ex parte Chari is applicable to the instant claims or invention. The instant claims do not recite specific techniques that provide a specific improvement to a computer system.
Applicant’s Argument: On pages 13-15 of Applicant’s response to rejections under 35 U.S.C. 101, applicant argues in view of Ex parte Desjardins that the present claims provide an improvement to machine learning that integrates any alleged abstract idea into a practical application. Applicant states that the improvement in the present disclosure is apparent because the claims recite the improvements to machine learning and the claim features describe such improvements.
Examiner’s Response: Applicant’s argument is not persuasive. Applicant merely states that the claimed invention and Desjardins are similar because both are directed to the same solution of improving machine learning by reducing computational resources and increasing model efficiency. Applicant does not provide any analysis of how the claim limitations in Desjardins are similar to the claim limitations of the present invention. The claimed improvement is not apparent because the claims as a whole fail to define the scope of the invention and fail to distinctly point out the subject matter. One of ordinary skill in the art cannot clearly distinguish from the claims what constitutes complex versus non-complex intermediate activation data. There is no standard for ascertaining what constitutes complex intermediate activation data. The claims are indefinite for failing to particularly point out and distinctly claim the subject matter. Thus, it is not apparent that the claims are directed to a particular improvement.
Applicant’s Argument: On pages 16-17 of Applicant’s response to rejections under 35 U.S.C. 101, applicant states “Similar to BASCOM, the present claims recite non-conventional and non-generic methods and systems, in this case for early exiting machine learning models when additional processing is unnecessary. For example, the Specification describes how processing complex data generally requires computationally intensive models that cannot be deployed on classes of devices with limited computational resources, such as mobile devices, edge devices, always-on devices, IoT devices, etc. Specification § [0005]. Applicant’s claimed solution improves upon these conventional techniques, for example, by allowing the model to exit early based on the complexity of the data, thereby saving computational resources and enabling these models to be deployed on lower power devices.”
Examiner’s Response: Applicant’s argument is not persuasive. Applicant argues that the technical improvement of the claimed invention allows complex data to be processed on devices with limited computational resources. Under the Step 2B analysis, the examiner must determine whether the claim includes an inventive concept and whether the claims recite significantly more than the abstract idea itself. Conventional techniques only allow complex data to be processed by computationally intensive models and devices. The examiner must evaluate the claims as a whole to determine whether the inventive concept is present in the recited claims. Claim 1 recites the steps of processing input data, making a determination based on complexity of the data, exiting the processing, and generating a classification result. The arrangement of these steps does not establish that complex data can require fewer processing operations while still producing reliable results from the model. The claims do not explicitly define when the exiting of the operation occurs, and it is unclear whether the process exits when the data is complex or when the data is not complex. Additionally, the claimed improvement of allowing the model to exit early based on the determination of the complexity of the data is an abstract idea of a mental process. Therefore, the additional elements do not recite significantly more than the abstract idea itself, and the claims are subject matter ineligible.
Applicant further argues that the claimed invention describes how the particular arrangement of steps provides a technical improvement. The claim does not disclose any specific arrangement of elements; rather, the claim discloses a generic first gate, which can be broadly interpreted as a checkpoint or determination point because the claims do not provide any additional details to further limit the scope of the definition of a first gate.
Applicant’s Argument: On pages 17-20 of Applicant’s response to rejections under 35 U.S.C. 102, applicant states “The Examiner argues that Zhang teaches “making a determination by a first gate, based on a complexity of the first intermediate activation data, whether or not to exit processing by the classification model”. However, in Zhang, the determination whether to perform an early exit is based on “a confidence level for a given video frame such that the output of the DNN [deep neural network] won’t lose an unacceptable level of accuracy.” Thus, Zhang makes the gating determination based on confidence (i.e., probability of an accurate result), not complexity.
The Examiner further argues that it is “inherent that if a complex image is provided to a model, the model processes the complex image to generate intermediate complex features.” Applicant respectfully disagrees with this argument for several reasons. First, the Examiner provides no basis for such an assertion. On the contrary, Applicant submits that the behavior of a model depends on many different factors, including how the model is set up and what kind of data is accepted and processed. Thus, Applicant disagrees that the Examiner’s asserted feature is “inherent.” Second, even assuming arguendo that the statement is true, the analysis does not change. Whether or not “the model processes the complex image to generate intermediate complex features” does not teach or suggest anything about “making a determination by a first gate, based on a complexity of the first intermediate activation data, whether or not to exit processing by the classification model.” Therefore, Applicant maintains that Zhang fails to describe at least these features of the claims.”
Examiner’s Response: Applicant’s argument is not persuasive. Applicant argues that Zhang teaches that the gating determination is based on confidence rather than complexity as recited in the claims. The specification does not explicitly define the scope of the term complexity, and the examiner has taken a broad interpretation of complexity as referring to any complex data. For example, Zhang (par. 55) teaches that an image can be simple when the biker and bike are entirely captured in the image, because a processing system can more easily recognize the image as showing biking activity. A complex image may refer to an image where the biker and bike are only partially shown, making it harder for the processing system to recognize the image as showing biking activity.
Zhang (par. 25) teaches a system with early exit branches and decision modules that perform computation on the input data and determine the complexity of the data using a confidence threshold. When the input data is a simple image, the system can classify the image more easily without using a large number of computational resources, and the confidence score of the classification will be high for simple images. For complex data, the system may require additional processing operations to classify the image, and the confidence score may not be very high because a complex image requires additional computational operations for a more accurate classification. Therefore, Zhang teaches a determination to exit based on confidence, and that determination falls within the scope of the definition of complexity.
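For purposes of illustration only, the confidence-threshold early-exit scheme described above may be sketched as follows. All names, values, and structure here are the Examiner's hypothetical illustration of the general mechanism, not Zhang's actual disclosure:

```python
# Illustrative sketch only: confidence-threshold early exit across
# successive exit branches. Names and values are hypothetical.
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def early_exit_classify(x, branches, threshold=0.9):
    """Run successive exit branches on input x; stop at the first branch
    whose top softmax probability (confidence) meets the threshold."""
    probs = None
    for i, branch in enumerate(branches):
        probs = softmax(branch(x))
        if probs.max() >= threshold:          # "simple" input: exit early
            return int(np.argmax(probs)), i   # (class label, exit index)
    return int(np.argmax(probs)), len(branches) - 1  # fell through to final classifier
```

Under this reading, a simple input yields a high confidence score at an early branch and triggers the exit, while a complex input falls through to later branches for additional processing.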
Applicant merely argues that the confidence disclosed in Zhang is not complexity but does not provide any detailed explanation of what constitutes complexity or how confidence differs from complexity as taught in the claimed invention. The Examiner suggests amending the claims to further define the scope of what constitutes complexity and to include additional details defining how the determination is made based on complexity.
Applicant’s Argument: On pages 20-23 of Applicant’s response to rejections under 35 U.S.C. 103, applicant states “In other words, while the probability to compute fine features (bt) may consider the state at the previous step, it is not checking for similarity (as one might see, for example, from calculating a difference between the current step and the previous step and then comparing the difference to a threshold). Accordingly, Wu does not teach or suggest “make a determination by a first gate based on a similarity of the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step, wherein the first gate comprises a temporal comparison model” as recited in Claim 24.
In response to remarks previously presented, the Examiner further relies on language from Wu that states, “We observe that redundant frames without additional information are ignored and those selected frames provide salient information for recognizing the class of interest.” Final Office Action, p. 9 (citing Wu, pp. 6-7). However, simply saying that redundant frames “are ignored” does not mean that the model “make[s] a determination by a first gate to exit processing based on a similarity” as recited in Claim 24 (emphasis added). In other words, the mere fact that redundant frames are ignored may be a consequence of some decision-making process in Wu, but it does not teach that similarity is the basis for any such decision.”
Examiner’s Response: Applicant’s argument is not persuasive. The claim recites “make a determination by a first gate to exit processing of the classification model based on a similarity of the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step, wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from the current time step to the previous intermediate activation data from the previous time step”. The claim limitation broadly recites that the determination is based on a similarity, which is a step of comparing data from a current time to data from a previous time. The claims do not provide a clear definition of the term similarity, and there are no additional claim elements to further limit the scope in which the comparison is performed. Therefore, under the broadest reasonable interpretation, the determination based on a similarity can be very broadly defined because the steps for determining a similarity between two sets of data are not explicitly recited. The determination of similarity is not limited solely to calculating a difference between the current step and the previous step and then comparing the difference to a threshold.
Wu teaches in Figure 4 that only certain frames of a video are selected to be processed for fine features, while other frames are not processed by the fine LSTM model. Redundant frames are ignored, and the frames that show the most distinctive features receive additional processing by the system. To determine that an image is redundant, there must necessarily be a process of comparing two images to compute their similarities and differences; it is impossible to conclude that an image is redundant without first comparing it to a past or similar image. Under the broadest reasonable interpretation, Wu teaches making a determination based on similarity because the model in Wu requires evaluating information of the current frame against historical data of previous frames to determine whether additional processing is required for the new and distinctive features in the frame. It may be possible to overcome the 103 rejections if the claims are amended to include additional elements that further define the scope in which the step of determining a similarity and making a comparison is performed.
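For purposes of illustration only, one possible, non-limiting form of the similarity determination discussed above (a difference between time steps compared against a threshold, which Applicant itself offers as an example) may be sketched as follows. The names and the threshold value are the Examiner's hypothetical illustration, not Wu's actual implementation:

```python
# Illustrative sketch only: a similarity gate that treats a frame as
# "redundant" when its activations differ little from the previous step.
# Names and threshold value are hypothetical.
import numpy as np

def is_redundant(current_activation, previous_activation, threshold=0.1):
    """One non-limiting similarity test: compare the magnitude of the
    difference between consecutive time steps against a threshold."""
    diff = np.linalg.norm(current_activation - previous_activation)
    return diff < threshold

prev = np.array([0.20, 0.40, 0.40])          # previous intermediate activation
curr_similar = np.array([0.21, 0.39, 0.40])  # nearly identical frame: skip fine processing
curr_distinct = np.array([0.90, 0.05, 0.05]) # distinctive frame: process further
```

Because the claims do not recite the specific comparison steps, other similarity measures would equally fall within the broadest reasonable interpretation.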
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-9, 11-23, and 25-29 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
The term “complexity” in claims 1, 15, and 29 is a relative term which renders the claim indefinite. The term “complexity” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. There is no standard for ascertaining what constitutes complex intermediate activation data. The claims need to be amended to provide additional details about some measure for determining complex and non-complex intermediate activation data. The Examiner interprets “complexity” as a numerical metric of how well the classification model is able to classify input data, such as model accuracy or a confidence score. When a model can accurately classify data, processing can terminate and the output is generated. When a model cannot classify data accurately, additional processing is required to obtain a more accurate output. The determination of complexity can compare the score to a threshold to further determine whether or not to exit processing.
Claims 1, 15, and 29 recite “making a determination by a first gate, based on a complexity of the first intermediate activation data, whether or not to exit processing by the classification model; exiting the processing by the classification model before a final classifier in a plurality of classifiers of the classification model”. The claims are unclear and indefinite because they fail to disclose what constitutes performing the exiting step. Does the classification model exit processing when the intermediate activation data is complex, or is the exiting step executed when the data is not complex? The Examiner interprets the exiting step as executed when the first gate determines that the intermediate activation data is not complex. The Specification (par. 23) aligns with this interpretation because less complex input data is handled earlier and does not require additional processing.
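For purposes of illustration only, the Examiner's interpretation above, under which "complexity" is read as a confidence-style score compared against a threshold and the exit occurs when the data is not complex, may be sketched as follows (names and values are hypothetical, not from the Specification):

```python
# Illustrative sketch only: the Examiner's interpretation of the claimed
# first-gate determination. "Complexity" is read as a confidence-style
# score, and early exit occurs when the intermediate activation data is
# NOT complex (i.e., already reliably classifiable).
def first_gate(confidence_score, threshold=0.9):
    """Return True to exit processing before the final classifier."""
    return confidence_score >= threshold

assert first_gate(0.95)       # not complex: exit before the final classifier
assert not first_gate(0.40)   # complex: continue processing
```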
Claims 2-9 and 11-14 depend from independent claim 1. Claims 16-23 and 25-28 depend from independent claim 15. The dependent claims are rejected on the same basis as their parent independent claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 4-9, 11-15, and 18-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding Claim 1:
Subject Matter Eligibility Analysis Step 1:
Claim 1 recites “A processor-implemented method of machine learning, comprising” and is thus a process, one of the four statutory categories of patentable subject matter.
Subject Matter Eligibility Analysis Step 2A Prong 1:
“making a determination … classification model” (a mental process that can be performed in the human mind, i.e. judgement)
“exiting the processing by the classification model before a final classifier in a plurality of classifiers of the classification model” (a mental process that can be performed in the human mind, i.e. judgement)
“generating a classification result from one of the plurality of classifiers of the classification model” (a mental process that can be performed in the human mind, i.e. evaluation)
Claim 1 therefore recites an abstract idea.
Subject Matter Eligibility Analysis Step 2A Prong 2:
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“making a determination by a first gate, …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
The additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity combined with generic computer functions implemented with generic computer elements at a high level of generality to perform the abstract idea identified above. Therefore, Claim 1 is directed to the abstract idea.
Subject Matter Eligibility Analysis Step 2B:
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“making a determination by a first gate, …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
The additional elements identified above, alone or in combination, do not recite significantly more than the abstract idea itself, as they amount to insignificant extra-solution activity combined with generic computer functions implemented with generic computer elements at a high level of generality to perform the abstract idea identified above. Therefore, Claim 1 is subject-matter ineligible.
Regarding Claim 15:
The claim recites a system (“A processing system for machine learning”) that performs the method as described in claim 1. Therefore, claim 15 is rejected for the same reasons as disclosed for claim 1. The limitations for additional elements of claim 15 are analyzed below.
Subject Matter Eligibility Analysis Step 2A Prong 1:
Please see the Step 2A Prong 1 analysis of claim 1.
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“at least one memory comprising computer-executable instructions” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“one or more processors configured to execute the computer-executable instructions and cause the processing system to” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claims 4 and 18:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“the determination …” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“the determination by the first gate …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“the method further comprises processing the first intermediate activation data with a first classifier of the plurality of classifiers to generate the classification result” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“each of the plurality of classifiers is associated with a model portion” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
“the classification model comprises a directional sequence of model portions” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
Regarding Claims 5 and 19:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“the determination by the first gate …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“the method further comprises providing the first intermediate activation data to a second portion of the classification model” (This step is directed to data gathering, which is understood to be insignificant extra solution activity (2106.05(g) in step 2A prong 2) and well understood, routine and conventional activity of transmitting and receiving data as identified by the court (2106.05(d) in step 2B))
“each of the plurality of classifiers is associated with a model portion” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
“the classification model comprises a directional sequence of model portions” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
Regarding Claims 6 and 20:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“making a determination based on the second intermediate activation data, whether or not to exit processing by the classification model” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“processing the first intermediate activation data by the second portion of the classification model to generate second intermediate activation data” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“providing the second intermediate activation data to a second gate” (This step is directed to data gathering, which is understood to be insignificant extra solution activity (2106.05(g) in step 2A prong 2) and well understood, routine and conventional activity of transmitting and receiving data as identified by the court (2106.05(d) in step 2B))
“making a determination by the second gate, …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claims 7 and 21:
Subject Matter Eligibility Analysis Step 2A Prong 1: None
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“the input data comprises image data” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
“the classification model comprises an image classification model” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e., a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
Regarding Claims 8 and 22:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“” (a mathematical calculation, see Spec. par. 58)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“wherein the first gate has been trained …” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claims 9 and 23:
Subject Matter Eligibility Analysis Step 2A Prong 1: None
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claim 24:
Subject Matter Eligibility Analysis Step 1:
Claim 24 recites “A processing system for machine learning comprising” and is thus a machine, one of the four statutory categories of patentable subject matter.
Subject Matter Eligibility Analysis Step 2A Prong 1:
“make a determination … time step to previous intermediate activation data from a previous time step, …” (a mental process that can be performed in the human mind, i.e. judgement)
“exiting the processing of the classification model based on the determination by the first gate” (a mental process that can be performed in the human mind, i.e. judgement)
“generating a classification result from one of a plurality of classifiers of the classification model” (a mental process that can be performed in the human mind, i.e. judgement)
Claim 24 therefore recites an abstract idea.
Subject Matter Eligibility Analysis Step 2A Prong 2:
“at least one memory comprising computer-executable instructions” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“one or more processors configured to execute the computer-executable instructions and cause the processing system to” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“make a determination by a first gate ” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“output the classification result” (This step is directed to outputting data, which is understood to be insignificant extra-solution (post-solution) activity - see MPEP 2106.05(g))
The additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity combined with generic computer functions implemented by generic computer elements at a high level of generality to perform the abstract idea identified above. Therefore, Claim 24 is directed to the abstract idea.
Subject Matter Eligibility Analysis Step 2B:
“at least one memory comprising computer-executable instructions” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“one or more processors configured to execute the computer-executable instructions and cause the processing system to” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“make a determination by a first gate ” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“output the classification result” (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and well-understood, routine, and conventional activity of gathering and analyzing information using conventional techniques and displaying the result, as identified by the courts - see MPEP 2106.05(d))
The additional elements identified above, alone or in combination, do not recite significantly more than the abstract idea itself, as they amount to insignificant extra-solution activity combined with generic computer functions implemented by generic computer elements at a high level of generality to perform the abstract idea identified above. Therefore, Claim 24 is subject-matter ineligible.
Regarding Claims 11 and 25:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“making another determination ” (a mental process that can be performed in the human mind, i.e. judgement)
“making a determination ” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“making another determination by the first gate ” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“processing the first intermediate activation data by a second portion of the classification model to generate second intermediate activation data” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“making a determination by a second gate, ” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claims 12 and 26:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“a determination to exit processing of the classification model” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“the determination by the second gate ” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
“the method further comprises processing the second intermediate activation data with a first classifier of the plurality of classifiers to generate the classification result” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Regarding Claims 13 and 27:
Subject Matter Eligibility Analysis Step 2A Prong 1:
“further comprising convolving the first intermediate activation data using one or more convolution layers prior to ” (a mathematical calculation)
“making the determination based on the complexity of the first intermediate activation data” (a mental process that can be performed in the human mind, i.e. judgement)
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B: None
Regarding Claims 14 and 28:
Subject Matter Eligibility Analysis Step 2A Prong 1: None
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“the input data comprises video data” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e. a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
“the classification model comprises a video classification model” (merely specifies a particular technological environment in which the abstract idea is to take place, i.e. a field of use, and thus does not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself - see MPEP 2106.05(h))
Regarding Claim 29:
The claim recites an article of manufacture that performs the method described in claim 1. Therefore, claim 29 is rejected for the same reasons as disclosed for claim 1. The additional elements of claim 29 are analyzed below.
Subject Matter Eligibility Analysis Step 2A Prong 1:
Please see the Step 2A, Prong 1 analysis of claim 1.
Subject Matter Eligibility Analysis Step 2A Prong 2 & 2B:
“A non-transitory computer-readable medium comprising computer- executable instructions that, when executed by one or more processors of a processing system, cause the processing system to perform a method of machine learning, the method comprising” (mere instructions to apply the exception using a generic computer component - see MPEP 2106.05(f))
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 4-7, 13, 15, 18-21, 27, and 29 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Zhang (US20210056357A1).
Regarding claim 1, Zhang teaches:
“A processor-implemented method of machine learning, comprising” ([0015], The system contains a processor to perform the methods of implementing input-adaptive neural networks. Input data is processed using a neural network.)
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” ([0063-0070, 0105, Figure 15], A CNN can receive images as input data in the input layers. The CNN may include feature-extraction layers consisting of convolution and activation layers to obtain useful features from the input data for processing. A convolution layer applies a convolution operation to the input data and outputs an activation map. The CNN model includes classification layers that classify the input image into various classes.)
“making a determination by a first gate, based on a complexity of the first intermediate activation data, whether or not to exit processing by the classification model” ([0083, 0105, 0116, Figure 15], The smaller early exit branch network takes the intermediate features generated by the internal convolutional layers of the base model and transforms them into early predictions. The decision module (first gate) then takes the early prediction results generated by the early exit branch and makes a decision on whether to exit the inference process and output the early prediction results, or to continue the inference process and pass the generated feature maps to the next layer. The early exit branch allows for a model to be flexible and adaptive to the difficulty of the input data. In one embodiment, a video frame exits early if its entropy-based confidence score (reflecting the complexity of the input) is higher than a pre-determined threshold.)
“exiting the processing by the classification model before a final classifier in a plurality of classifiers of the classification model” ([0105, 0142-0144, Figure 15], The early exit branch is a small-size neural network (classifier) and contains convolutional, activation, pooling, and fully-connected layers. Each early exit consists of an early exit branch and decision modules. A neural network may have one or more early exits. When easy input data is processed by the neural network, the first early exit point can determine an acceptable level of inference confidence and output the results without further processing until the information reaches the final exit point.)
“generating a classification result from one of the plurality of classifiers of the classification model” ([0105, 0117, Figure 15], Multiple early exit branches may be placed in the DNN architecture and each early exit branch is a small-size neural network. When the confidence score of the early exit prediction results meets a threshold, the model may exit the inference process and output the early prediction results.)
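For illustration only, the early-exit control flow described in the mapping above (intermediate activations fed to a small branch classifier, with an exit taken when an entropy-based confidence score clears a pre-determined threshold) can be sketched as below. This is a minimal sketch; the function names, confidence formula, and threshold value are assumptions for illustration, not Zhang's disclosed implementation.

```python
import numpy as np

def entropy_confidence(probs):
    """Confidence score: 1 minus the normalized entropy of the branch's
    softmax output (peaked distribution -> high confidence)."""
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps))
    max_entropy = np.log(len(probs))
    return 1.0 - entropy / max_entropy

def run_with_early_exit(x, stages, branches, threshold=0.8):
    """Pass activations through successive model portions; exit at the
    first branch whose confidence clears the threshold."""
    act = x
    for stage, branch in zip(stages, branches):
        act = stage(act)      # intermediate activation data
        probs = branch(act)   # early-exit branch prediction
        if entropy_confidence(probs) >= threshold:
            return probs, True   # early exit taken
    return probs, False          # fell through toward the final classifier
```

In this sketch, "easy" inputs that yield a peaked branch distribution exit at the first branch, while "hard" inputs fall through to later portions of the model.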
Regarding claim 15:
Claim 15 recites a system (“A processing system for machine learning”) that performs the same process as described in Claim 1. Therefore, claim 15 is rejected for the same reasons mentioned for claim 1. The additional elements of claim 15 are addressed below:
“at least one memory comprising computer-executable instructions” ([0015], The system consists of a memory having a set of instructions stored thereon.)
“one or more processors configured to execute the computer-executable instructions and cause the processing system to” ([0015], The system includes a processor, at least one data input source, a memory having a set of instructions stored thereon, which when executed by the processor, cause the system to acquire a series of input data units from the data input source, process a first data unit of the series using an input-adaptive neural network.)
Regarding claims 4 and 18, Zhang teaches:
“the determination by the first gate comprises a determination to exit processing of the classification model” ([0105, 0143-0144, Figure 7A], There may be one or more early exit components. The first early exit point may determine whether the model has reached an acceptable level of inference confidence.)
“the method further comprises processing the first intermediate activation data with a first classifier of the plurality of classifiers to generate the classification result” ([0105,0143-0144, Figure 7A], The first early exit branch may process the intermediate data from the base layers. The first early exit branch may process the data and output a classification result from the neural network.)
“each of the plurality of classifiers is associated with a model portion” ([0105, 0145-0146, Figure 7A], Figure 7A shows early exit 108A receiving input data from base layer 104B and early exit 108B receiving input from a different portion of the model, base layer 104C.)
“the classification model comprises a directional sequence of model portions” ([0105, 0143-0146, Figure 7A], A standard data processing flow path 103 represents how the neural network 100 would classify the input 102 without the early exits 108A-B, passing data through every base layer 104A-D regardless of input, culminating in a final exit output 110. Figure 7A shows a sequential progression of the data from one layer to the next.)
Regarding claims 5 and 19, Zhang teaches:
“the determination by the first gate comprises a determination to continue processing of the classification model” ([0105, 0145-0146, Figure 7B], When the model has not achieved a threshold level of inference confidence at the first early exit point, the data processing flow path continues until it reaches an early exit point where the condition is met.)
“the method further comprises providing the first intermediate activation data to a second portion of the classification model” ([0105, 0145-0146, Figure 7B], Figure 7B shows that the model did not exit early at 108A and the intermediate data continues to be processed by sending the data to the next layer 104C (second portion of the classification model).)
“each of the plurality of classifiers is associated with a model portion” ([0105,0145-0146, Figure 7B], Figure 7B shows early exit 108A receiving input data from base layer 104B and early exit 108B receiving input from a different portion of the model, base layer 104C.)
“the classification model comprises a directional sequence of model portions” ([0105, 0143-0146, Figure 7B], A standard data processing flow path 103 represents how the neural network 100 would classify the input 102 without the early exits 108A-B, passing data through every base layer 104A-D regardless of input, culminating in a final exit output 110. Figure 7B shows a sequential progression of the data from one layer to the next.)
Regarding claims 6 and 20, Zhang teaches:
“processing the first intermediate activation data by the second portion of the classification model to generate second intermediate activation data” ([0105, 0145-0146, Figure 7B], Figure 7B shows that the model did not exit early at 108A and the intermediate data from layer 104B (first intermediate activation data) continues to be processed by sending the data to the next layer 104C (second portion of the classification model).)
“providing the second intermediate activation data to a second gate” ([0105,0145-0146, Figure 7B], Figure 7B shows layer 104C processing the intermediate data from the previous layer and sending the output (second intermediate activation data) to the early exit point 108B.)
“making a determination by the second gate, based on the second intermediate activation data, whether or not to exit processing by the classification model” ([0105, 0145-0146, Figure 7B], When the early exit branch 108B (second gate) receives the second intermediate data from base layer 104C as input, the network will process the information and determine an inference confidence score. The confidence score determines if the model can exit or not.)
Regarding claims 7 and 21, Zhang teaches:
“the input data comprises image data” ([0063,0116, Figure 1A], Input data can be image data.)
“the classification model comprises an image classification model” ([0062-0063], The neural network can be convolutional neural network for image analysis.)
Regarding claims 13 and 27, Zhang teaches:
“further comprising convolving the first intermediate activation data using one or more convolution layers prior to making the determination based on the complexity of the first intermediate activation data” ([0069-0070, 0105], The base model consists of convolutional layers that process the input data and aggregate the outputs prior to providing the intermediate data to the early exit architecture. The smaller early exit branch network takes the intermediate features generated by the internal convolutional layers of the base model and transforms them into early predictions.)
Regarding claim 29:
Claim 29 recites a program product that performs the same process as described in Claim 1. Therefore, claim 29 is rejected for the same reasons mentioned for claim 1. The additional elements of claim 29 are addressed below:
“A non-transitory computer-readable medium comprising computer- executable instructions that, when executed by one or more processors of a processing system, cause the processing system to perform a method of machine learning, the method comprising” ([0015], The system includes a processor, at least one data input source, a memory having a set of instructions stored thereon, which when executed by the processor, cause the system to acquire a series of input data units from the data input source, process a first data unit of the series using an input-adaptive neural network.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-3, 8, 16-17, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US20210056357A1) in view of Herrmann, "An End-To-End Approach For Speeding Up Neural Network Inference".
Regarding claims 2 and 16, Zhang teaches:
“wherein the first gate comprises: a pooling layer configured to reduce a dimensionality of the first intermediate activation data” ([0086-0088,0105], The early exit branch may consist of pooling layers to reduce the spatial size of the representation.)
“one or more neural network layers configured to generate the determination of whether or not to exit processing by the classification model” ([0105], The early exit branch may consist of convolutional, activation, pooling, and fully-connected layers to generate early predictions. The decision module then takes the early prediction results generated by the early exit branch and makes a decision on whether to exit the inference process and output the early prediction results.)
“a Gumbel sampling component” ([0116], Zhang does not explicitly disclose a Gumbel distribution, but Zhang does disclose using a softmax classification probability distribution to determine the confidence score of an early exit.)
Zhang does not explicitly disclose an implementation of “a Gumbel sampling component”. However, Herrmann discloses in the same field of endeavor:
“a Gumbel sampling component” ([pg. 9, section 5, par. 1-2], The model uses Gumbel distribution to make a decision as to whether the instance can be successfully classified at that stage and whether an early exit is necessary at the intermediate classifier.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of a “Gumbel sampling component” from Herrmann into the teaching of Zhang. Doing so can improve the inference speed of a neural network by using Gumbel reparameterization (Herrmann, abstract).
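For context, a Gumbel sampling component of the kind Herrmann describes typically uses the Gumbel-max (or Gumbel-softmax) trick to draw a discrete exit/continue decision from the gate's logits while keeping the decision stochastic. A minimal sketch, assuming a two-way {continue, exit} gate; the function name and index encoding are illustrative, not Herrmann's implementation:

```python
import numpy as np

def gumbel_max_decision(logits, rng):
    """Gumbel-max trick: adding i.i.d. Gumbel(0, 1) noise to the gate's
    logits and taking the argmax yields a sample from the softmax
    distribution over the options (here, 0 = continue, 1 = exit)."""
    gumbel_noise = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + gumbel_noise))
```

During training, the differentiable Gumbel-softmax relaxation is typically used in place of the hard argmax so the gate can be trained end-to-end.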
Regarding claims 3 and 17, Zhang teaches:
“wherein the one or more neural network layers comprise a plurality of multi-layer perceptron layers” ([0089, 0105], The fully connected layers may be feed-forward multi-layer perceptron layers. The early exit branch may consist of fully connected layers.)
Regarding claims 8 and 22, Zhang teaches:
“wherein the first gate has been trained using a ” ([0111-0112, 0131, 0135], The early exit branch may be trained using a loss function to minimize the model’s error. The system may modify the model’s accuracy to account for resource demands and availability.)
Zhang does not explicitly disclose an implementation of “a batch-shaping loss function”. However, Herrmann discloses in the same field of endeavor:
“wherein the first gate has been trained using a batch-shaping loss function to minimize classification error and to minimize processing resource usage” ([pg. 3, section 3, par. 1-2], The approach uses a per-batch activation loss function.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of a “batch-shaping loss function” from Herrmann into the teaching of Zhang. Doing so can improve the inference speed of a neural network by using a batch activation loss function (Herrmann, abstract).
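For context, a batch-shaping loss of the kind attributed to Herrmann pushes the distribution of a gate's per-example outputs across a batch toward a chosen prior, discouraging the gate from collapsing to always-exit or never-exit. A minimal sketch, assuming a uniform prior for simplicity (Herrmann's actual prior and formulation may differ); the function name is illustrative:

```python
import numpy as np

def batch_shaping_loss(gate_probs):
    """Batch-shaping-style loss (sketch): penalize the squared gap between
    the empirical CDF of the gate's per-example exit probabilities and the
    prior's CDF (uniform here), so decisions spread across the batch."""
    x = np.sort(gate_probs)
    n = len(x)
    empirical_cdf = (np.arange(1, n + 1) - 0.5) / n
    prior_cdf = x  # CDF of Uniform(0, 1) evaluated at the sorted outputs
    return float(np.mean((prior_cdf - empirical_cdf) ** 2))
```

This term would be added, with a weighting coefficient, to the classification loss, so the gate is trained both to classify correctly and to spread its exit decisions, saving processing resources on easy inputs.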
Claims 9, 11-12, 14, 23-26, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US20210056357A1) in view of Wu, "LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition".
Regarding claims 9 and 23, Zhang teaches:
“wherein the first gate comprises a ” ([0025, 0105], The early exit branch may consist of a neural network to process the intermediate features from the internal convolutional layers of the base model.)
Zhang does not explicitly disclose an implementation of “wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step”. However, Wu discloses in the same field of endeavor:
“wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step” ([pg. 3, section 2.1, par. 1-3, Figure 1], The system consists of gating modules and temporal modeling that process and compute the features of the incoming video frames (current time step) and features from a previous time step. The framework consists of a coarse and a fine LSTM, which is a recurrent neural network that compares current input data with previous hidden states to process sequences.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from a current time step to previous intermediate activation data from a previous time step” from Wu into the teaching of Zhang. Doing so can improve the computational efficiency of the classification model by implementing the method of adjusting the computing power for an incoming video frame (Wu, abstract).
Regarding claim 24, Zhang teaches:
“A processing system for machine learning, comprising: at least one memory comprising computer-executable instructions” ([0015], The system consists of a memory having a set of instructions stored thereon to perform the methods of implementing input-adaptive neural networks. Input data is processed using a neural network.)
“one or more processors configured to execute the computer-executable instructions and cause the processing system to” ([0015], The system includes a processor, at least one data input source, a memory having a set of instructions stored thereon, which when executed by the processor, cause the system to acquire a series of input data units from the data input source, process a first data unit of the series using an input-adaptive neural network.)
“processing input data in a first portion of a classification model to generate first intermediate activation data, the classification model being a machine learning model” ([0063-0070, 0105, Figure 15], A CNN can receive images as input data in the input layers. The CNN may include feature-extraction layers that consist of convolution and activations layer to obtain useful features from the input data for processing. A convolution layer applies a convolution operation to the input data and outputs an activation map. The CNN model consists of classification layers that classify the input image into various classes.)
“make a determination by a first gate to exit processing of the classification model based on ” ([0105, 0116], The decision module makes a decision on whether to exit the inference process of the classification model or not based on processing the intermediate features of the internal convolutional layers of the base model. The early exit branch is a small-size neural network. The decision module compares the confidence level of the early exit branch to a threshold to determine if an early exit is appropriate.)
“exiting the processing of the classification model based on the determination by the first gate” ([0105, 0142-0144, Figure 15], Each early exit consists of an early exit branch and decision modules. A neural network may have one or more early exits. When easy input data is processed by the neural network, the first early exit point can determine an acceptable level of inference confidence and output the results without further processing until the information reaches the final exit point.)
“generating a classification result from one of a plurality of classifiers of the classification model” ([0105, 0117, Figure 15], Multiple early exit branches may be placed in the DNN architecture and each early exit branch is a small-size neural network. When the confidence score of the early exit prediction results meets a threshold, the model may exit the inference process and output the early prediction results.)
“output the classification result” ([0144, Figure 7A], The early exit branch may determine it has already reached an acceptable level of inference confidence and output the results of the classification model. Figure 7A shows the early exit output 112.)
Zhang does not explicitly disclose an implementation of “make a determination by a first gate to exit processing of the classification model based on a similarity of the first intermediate activation data from the current time step to previous intermediate activation data from the previous time step, wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from the current time step to the previous intermediate activation data from the previous time step”. However, Wu discloses in the same field of endeavor:
“make a determination by a first gate to ” ([pg. 2, par. 1-2; pg. 3, section 2, par. 1-4, Figures 1 & 4], The gating module computes a probability and, based on the computed probability, determines whether to process the fine features of the video frame or skip the processing step. The system decides to process the fine features of the frame in order to make better predictions of the label, such as distinguishing “making latte” from “making cappuccino”. The framework learns a policy that determines at each time step whether an input video frame contains discriminative features by computing current and historical features. The claims do not explicitly recite additional information to limit the scope of what constitutes a similarity between intermediate activation data or how the comparison is performed. Under the broadest reasonable interpretation, the LSTM proposed by Wu teaches similarity because the LSTM analyzes sequences of image features to determine spatial similarity.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “make a determination by a first gate to exit processing of the classification model based on a similarity of the first intermediate activation data from the current time step to previous intermediate activation data from the previous time step, wherein the first gate comprises a temporal comparison model configured to compare the first intermediate activation data from the current time step to the previous intermediate activation data from the previous time step” from Wu into the teaching of Zhang. Doing so can improve the computational efficiency of the classification model by implementing the method of adjusting the computing power for an incoming video frame (Wu, abstract).
Regarding claims 11 and 25, Zhang teaches:
“making another determination by the first gate based on ” ([0105], The decision module makes a decision on whether to exit the inference process of the classification model or not based on processing the intermediate features of the internal convolutional layers of the base model. The decision module may continue the inference process and pass the generated feature maps to the next layer.)
“processing the first intermediate activation data by a second portion of the classification model to generate second intermediate activation data” ([0066, 0105, 0145-0146, Figure 7B], The feature-extraction layers of the model find a number of features in the image input data. The model complexity can be based on number of features, data type, and data size. The data may continue to be processed by the next portion of the network if it is determined that more inference is required to process the input data.)
“making a determination by a second gate, based on the second intermediate activation data, whether or not to exit processing by the classification model” ([0144, Figure 7B], The early exit branch may determine it has already reached an acceptable level of inference confidence and output the results of the classification model. Figure 7B shows the second early exit branch 108B and it continues to process the input data after it was determined by early exit branch 108A that the inference needs to continue.)
Zhang does not explicitly disclose an implementation of “making another determination by the first gate based on a dissimilarity of the first intermediate activation data from the current time step to previous intermediate activation data from the previous time step”. However, Wu discloses in the same field of endeavor:
“making another determination by the first gate based on a dissimilarity of the first intermediate activation data from the current time step to previous intermediate activation data from the previous time step, wherein the determination by the first gate comprises a determination to continue processing by the classification model” ([pg. 2, par. 1-2; pg. 3, section 2, par. 1-4, Figures 1 & 4], The gating module computes a probability and, based on the computed probability, determines whether to process the fine features of the video frame or skip the processing step. The system decides to process the fine features of the frame in order to make better predictions of the label, such as distinguishing “making latte” from “making cappuccino”. The framework learns a policy that determines at each time step whether an input video frame contains discriminative features by computing current and historical features. The claims do not explicitly recite additional information to limit the scope of what constitutes a dissimilarity between intermediate activation data. Under the broadest reasonable interpretation, the LSTM proposed by Wu teaches dissimilarity because the LSTM analyzes sequences of image features to determine spatial similarity. When distinct features are determined, the gating module outputs a high probability to compute fine features, which requires additional processing.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “making another determination by the first gate based on a dissimilarity of the first intermediate activation data from the current time step to previous intermediate activation data from the previous time step, wherein the determination by the first gate comprises a determination to continue processing by the classification model” from Wu into the teaching of Zhang. Doing so can improve the computational efficiency of the classification model by implementing the method of adjusting the computing power for an incoming video frame (Wu, abstract).
Regarding claims 12 and 26, Zhang teaches:
“the determination by the second gate comprises a determination to exit processing of the classification model” ([0105], The decision module makes a decision on whether to exit the inference process of the classification model or not based on processing the intermediate features of the internal convolutional layers of the base model. Each early exit branch, including the second early exit branch can determine to exit the process or not.)
“the method further comprises processing the second intermediate activation data with a first classifier of the plurality of classifiers to generate the classification result” ([0105, 0143-0144, Figure 7B], The early exit branch may process the intermediate data from the base layers and output a classification result from the neural network. In Figure 7B, early exit branch 108A does not exit the inference process, and the data continues to be processed by the next portion of the classifier and the next early exit branch 108B.)
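For illustration only, the early-exit behavior described above can be sketched as follows. This is an assumed structure (stage and function names are hypothetical), not Zhang's actual model: each portion of the model processes the intermediate data and an attached branch classifier produces a candidate label with a confidence score; the inference exits early once the confidence clears a threshold, and otherwise the data continues to the next portion of the model, as in Figure 7B.

```python
def run_with_early_exit(stages, x, threshold=0.9):
    """Run a chain of (process_fn, classify_fn) stages with early exiting.

    Each process_fn transforms the intermediate data; each classify_fn
    returns a (label, confidence) pair. The loop exits at the first stage
    whose confidence meets the threshold, or at the final stage.
    Returns (label, index_of_exiting_stage)."""
    last = len(stages) - 1
    for i, (process, classify) in enumerate(stages):
        x = process(x)               # next portion of the base model
        label, confidence = classify(x)  # early exit branch classifier
        if confidence >= threshold or i == last:
            return label, i
```

Easier inputs exit at earlier branches and consume fewer stages; harder inputs fall through to later branches, which is the computational-savings behavior the rejection attributes to the cited early-exit branches.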
Regarding claims 14 and 28, Zhang teaches:
“the input data comprises video data” ([0058], Video streams can be processed by the system.)
“the classification model comprises a video classification model” ([0062-0063, 0151], The CNN can be used for video analysis such as labeling of human activities in videos.)
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GARY MAC whose telephone number is (703)756-1517. The examiner can normally be reached Monday - Friday 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GARY MAC/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127