Prosecution Insights
Last updated: April 19, 2026
Application No. 18/447,347

MACHINE LEARNING DEVICE, MACHINE LEARNING METHOD, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM HAVING EMBODIED THEREON A MACHINE LEARNING PROGRAM

Non-Final OA: §101, §102, §103

Filed: Aug 10, 2023
Examiner: PHAM, JESSICA THUY
Art Unit: 2121
Tech Center: 2100 — Computer Architecture & Software
Assignee: JVCKENWOOD Corporation
OA Round: 1 (Non-Final)

Grant Probability: 33% (At Risk)
OA Rounds: 1-2
To Grant: 3y 3m
With Interview: 0%

Examiner Intelligence

Grants only 33% of cases.

Career Allow Rate: 33% (1 granted / 3 resolved; -21.7% vs TC avg)
Interview Lift: -33.3% (minimal; allow rate with vs. without interview, across resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 41 across all art units (38 currently pending)

Statute-Specific Performance

§101: 26.8% (-13.2% vs TC avg)
§102: 11.0% (-29.0% vs TC avg)
§103: 35.5% (-4.5% vs TC avg)
§112: 22.7% (-17.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 3 resolved cases.

Office Action

§101, §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

Claims 1-6 are pending and examined herein. Claim 4 is objected to. Claims 1-6 are rejected under 35 U.S.C. 101. Claims 1-6 are rejected under 35 U.S.C. 102. Claims 1-6 are rejected under 35 U.S.C. 103.

Information Disclosure Statement

The attached information disclosure statements (IDS) filed on 10/11/2023, 7/08/2024, and 3/21/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections

Claim 4 is objected to because of the following informalities: “full-connected” should be “fully-connected”. Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-4 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because they recite a product that does not have a physical/tangible form and does not have any structural limitations, otherwise known as “software per se”. See MPEP § 2106.03(I). The examiner recommends the claims be amended to include at least a processor and a memory.

Claims 1-6 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. MPEP § 2106(III) sets out steps for evaluating whether a claim is drawn to patent-eligible subject matter. The analysis of claims 1-6, in accordance with these steps, follows.

Step 1 Analysis: Step 1 is to determine whether the claim is directed to a statutory category (process, machine, manufacture, or composition of matter). Claims 1-4, if amended, would be directed to a machine, claim 5 is directed to a process, and claim 6 is directed to an article of manufacture. All claims, if amended, would be directed to statutory categories, and the analysis proceeds.

Step 2A Prong One, Step 2A Prong Two, and Step 2B Analysis: Step 2A Prong One asks if the claim recites a judicial exception (abstract idea, law of nature, or natural phenomenon). If the claim recites a judicial exception, analysis proceeds to Step 2A Prong Two, which asks if the claim recites additional elements that integrate the abstract idea into a practical application. If the claim does not integrate the judicial exception, analysis proceeds to Step 2B, which asks if the claim amounts to significantly more than the judicial exception. If the claim does not amount to significantly more than the judicial exception, the claim is not eligible subject matter under 35 U.S.C. 101. None of the claims represents an improvement to technology.
Regarding claim 1, the following are abstract ideas:

a domain adaptation data richness determination unit that, when a first model trained by using training data of a first domain is trained by transfer learning by using training data of a second domain, determines a domain adaptation data richness based on the number of items of training data of the second domain, the first model being a neural network; (Determining a domain adaptation data richness based on the number of items of training data of the second domain when a model is trained can be practically performed in the human mind. This is a mental process.)

a learning layer determining unit that determines a layer in the second model, which is a duplicate of the first model, targeted for training, based on the domain adaptation data richness; (Determining a layer based on the domain adaptation data richness can be practically performed in the human mind. This is a mental process.)

The following claim elements are additional elements which, taken alone or in combination with the other additional elements, do not integrate the judicial exception into a practical application nor amount to significantly more than the judicial exception:

A machine learning device comprising: (This recites generic machine learning components, which amounts to mere instructions to apply an exception.)

a transfer learning unit that applies transfer learning to the layer in the second model targeted for training, by using the training data of the second domain. (This recites generic machine learning processes and components, which amounts to mere instructions to apply an exception.)

Regarding claim 2, the rejection of claim 1 is incorporated herein. Further, the following is an abstract idea: wherein the learning layer determination unit ensures that the higher the domain adaptation data richness, the larger the number of layers targeted for training, and the lower the domain adaptation data richness, the smaller the number of layers targeted for training. (Ensuring that the number of layers targeted for training is larger for higher richness values and smaller for lower ones can be practically performed in the human mind. This is a mental process.)

Regarding claim 3, the rejection of claim 1 is incorporated herein. Further, the following is an abstract idea: wherein the learning layer determination unit includes more of layers near an input layer as layers targeted for training, as the domain adaptation data richness becomes higher. (Determining to include more layers near an input layer can be practically performed in the human mind. This is a mental process.)

Regarding claim 4, the rejection of claim 1 is incorporated herein. Further, the following is an abstract idea: wherein the learning layer determination unit determines only full-connected layers to be layers targeted for training when the domain adaptation data richness is equal to or lower than a predetermined value. (Determining which layers to choose based on the domain adaptation data richness can be practically performed in the human mind. This is a mental process.)
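To make the recited structure concrete, here is a minimal, editorial PyTorch sketch of the three claimed units (richness determination, layer determination, transfer learning). The richness threshold, the "train only the last two layers" split, and every function name are our assumptions for illustration; neither the application nor the Office Action discloses this code.

```python
import copy
import torch
import torch.nn as nn

def data_richness(num_target_items: int) -> str:
    """Hypothetical richness heuristic: bucket the target-domain sample count."""
    return "high" if num_target_items >= 10_000 else "low"  # assumed threshold

def layers_targeted_for_training(model: nn.Sequential, richness: str):
    """Train every layer when richness is high; only the last layers when low."""
    layers = list(model.children())
    return layers if richness == "high" else layers[-2:]    # assumed split point

def transfer_learn(first_model: nn.Sequential, target_loader, num_target_items: int):
    second_model = copy.deepcopy(first_model)   # second model duplicates the first
    for p in second_model.parameters():
        p.requires_grad = False                 # freeze everything by default
    richness = data_richness(num_target_items)
    trainable = []
    for layer in layers_targeted_for_training(second_model, richness):
        for p in layer.parameters():
            p.requires_grad = True              # unfreeze only the targeted layers
            trainable.append(p)
    optimizer = torch.optim.SGD(trainable, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in target_loader:                  # fine-tune on second-domain data
        optimizer.zero_grad()
        loss_fn(second_model(x), y).backward()
        optimizer.step()
    return second_model
```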
Regarding claim 5, the following claim elements are additional elements which, taken alone or in combination with the other additional elements, do not integrate the judicial exception into a practical application nor amount to significantly more than the judicial exception: A machine learning method comprising: (This recites generic machine learning processes and components, which amounts to mere instructions to apply an exception.) The remainder of claim 5 recites substantially similar subject matter to claim 1 and is rejected with the same rationale, mutatis mutandis.

Regarding claim 6, the following claim elements are additional elements which, taken alone or in combination with the other additional elements, do not integrate the judicial exception into a practical application nor amount to significantly more than the judicial exception: A non-transitory computer-readable recording medium having embodied thereon a machine learning program comprising computer-implemented modules including: (This recites generic computer components and processes and generic machine learning processes and components, which amount to mere instructions to apply an exception.) The remainder of claim 6 recites substantially similar subject matter to claim 1 and is rejected with the same rationale, mutatis mutandis.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Soekhoe ("On the Impact of Data Set Size in Transfer Learning Using Deep Neural Networks", 2016). This reference was made available by Applicant via IDS.

Regarding claim 1, Soekhoe teaches:

A machine learning device comprising: (Page 55 states "To conduct our experiments, we use the Caffe deep learning framework developed at UC Berkeley [7]. We make use of a single Nvidida GTX Titan X graphics card to enable Caffe in GPU mode, to speed up our training time. We use the AlexNet reference model which is included in Caffe." The system used to execute the method is interpreted as the machine learning device.)

a domain adaptation data richness determination unit that, when a first model trained by using training data of a first domain is trained by transfer learning by using training data of a second domain, determines a domain adaptation data richness based on the number of items of training data of the second domain, the first model being a neural network; (Page 50 states "The objective of transfer learning is to use knowledge of a source task and transfer that to a new target task [10]." The source task is interpreted as the first domain and the target task is interpreted as the second domain. Page 54 states "We randomly split the entire data set into a source and a target partition, Nsource and Ntarget respectively, where each partition contains 50,000 images." Page 55 states "To create a model from which we can transfer the features, we first train our network on Nsource. The parameters of the source model are stored in a Caffemodel object (see Sect. 3.4), which we use to transfer the parameters from the source model to the target model." Therefore, the first model is trained by the training data of a first domain and, as the parameters are transferred to another model, is trained by the training data of the target domain. Page 52 states "We will transfer features from a CNN trained on a source task, to a target task, i.e. data sets with disjunct outcome classes." Therefore, the models are neural networks. Page 50 states "For the first n instances of a new class, freeze the first l layers of the network. Once you have obtained more than n instances for new class, training can simply affect all layers. Obviously the values for n and l depend on the data and task at hand, in our experiments freezing the first 3 layers until 300 (Tiny-ImageNet) and respectively 900 (MiniPlaces2) instances per class gave the best results." The number of instances for the new class/target domain is interpreted as the domain adaptation data richness.)

a learning layer determining unit that determines a layer in the second model, which is a duplicate of the first model, targeted for training, based on the domain adaptation data richness; and (Page 50 states "For the first n instances of a new class, freeze the first l layers of the network. Once you have obtained more than n instances for new class, training can simply affect all layers. Obviously the values for n and l depend on the data and task at hand, in our experiments freezing the first 3 layers until 300 (Tiny-ImageNet) and respectively 900 (MiniPlaces2) instances per class gave the best results." Page 55 states "When we transfer the parameters to the target model, we keep them fixed. That is to say, we do not update the parameters by gradient descent. The remaining 8 − l layers of the network we randomly initialize and let the errors backpropagate through the layers." As the first l layers are selected to be frozen and the remainder are trained in the second model when there are over n instances, the determination is based on the domain adaptation data richness. Page 55 states "However, since we are also interested in at what layer l of the network features are able to generalize, we transfer the features from the source to the target task, one layer at a time. AlexNet has eight layers in total." Therefore, the models are both AlexNet and the second model is a duplicate of the first model.)

a transfer learning unit that applies transfer learning to the layer in the second model targeted for training, by using the training data of the second domain. (Page 55 states "When we transfer the parameters to the target model, we keep them fixed. That is to say, we do not update the parameters by gradient descent. The remaining 8 − l layers of the network we randomly initialize and let the errors backpropagate through the layers." Page 58 states "Mean accuracy obtained after training on the target splits of Tiny-ImageNet where i in Mtargeti equals 500, 400, 300, 200, 100 and 50 and validating on Vtarget." Therefore, the selected 8 − l layers of the second model are trained.)
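For readers tracing the claim 1 mapping, the Soekhoe procedure as the examiner characterizes it (train on Nsource, duplicate the network, keep the first l transferred layers fixed, randomly re-initialize and train the remaining 8 − l layers) can be sketched as follows. PyTorch stands in for the paper's Caffe setup here, and the helper name is ours, not Soekhoe's code.

```python
import copy
import torch.nn as nn

def make_target_model(source_model: nn.Sequential, l: int) -> nn.Sequential:
    """Duplicate the source network, fix the first l transferred layers,
    and randomly re-initialize the rest so only they are trained."""
    target = copy.deepcopy(source_model)    # parameters transferred wholesale
    layers = list(target.children())
    for layer in layers[:l]:                # transferred layers stay fixed
        for p in layer.parameters():
            p.requires_grad = False         # no gradient-descent updates
    for layer in layers[l:]:                # remaining 8 - l layers reset
        if hasattr(layer, "reset_parameters"):
            layer.reset_parameters()        # trained later by backpropagation
    return target
```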
Regarding claim 2, the rejection of claim 1 is incorporated herein. Soekhoe teaches wherein the learning layer determination unit ensures that the higher the domain adaptation data richness, the larger the number of layers targeted for training, and the lower the domain adaptation data richness, the smaller the number of layers targeted for training. (Page 50 states "For the first n instances of a new class, freeze the first l layers of the network. Once you have obtained more than n instances for new class, training can simply affect all layers. Obviously the values for n and l depend on the data and task at hand, in our experiments freezing the first 3 layers until 300 (Tiny-ImageNet) and respectively 900 (MiniPlaces2) instances per class gave the best results." Therefore, when there are more than n instances, more layers (all layers) are targeted for training, and when there are fewer than n instances, meaning that the domain adaptation data richness is lower, the number of layers targeted for training is smaller, as the first l layers are frozen.)

Regarding claim 3, the rejection of claim 1 is incorporated herein. Soekhoe teaches wherein the learning layer determination unit includes more of layers near an input layer as layers targeted for training, as the domain adaptation data richness becomes higher. (Page 50 states "For the first n instances of a new class, freeze the first l layers of the network. Once you have obtained more than n instances for new class, training can simply affect all layers. Obviously the values for n and l depend on the data and task at hand, in our experiments freezing the first 3 layers until 300 (Tiny-ImageNet) and respectively 900 (MiniPlaces2) instances per class gave the best results." Therefore, when there are more than n instances, all layers, including the input layers, are targeted for training, meaning that more layers near an input layer are chosen.)

Regarding claim 4, the rejection of claim 1 is incorporated herein. Soekhoe teaches wherein the learning layer determination unit determines only full-connected layers to be layers targeted for training when the domain adaptation data richness is equal to or lower than a predetermined value. (Page 50 states "For the first n instances of a new class, freeze the first l layers of the network. Once you have obtained more than n instances for new class, training can simply affect all layers. Obviously the values for n and l depend on the data and task at hand, in our experiments freezing the first 3 layers until 300 (Tiny-ImageNet) and respectively 900 (MiniPlaces2) instances per class gave the best results." Soekhoe therefore teaches changing the values of n and l. Page 53 states "The model consists of five convolutional layers and three fully connected layers." Fig. 1 shows that the last three layers are the dense (fully-connected) layers. Therefore, when l = 5, the first five convolutional layers will be frozen and only the fully-connected layers will be targeted for training when the number of instances in the target domain, interpreted as the domain adaptation data richness, is equal to or lower than n.)

Regarding claim 5, Soekhoe teaches A machine learning method comprising: (Page 51 states "In this work we will expand the study by [17], and measure the effect of target data set size on the transferability of parameters in convolutional neural networks. Our main contribution is to quantify the extent to which features are able to generalise to the target data set when we systematically reduce its size. We will investigate this for each individual layer by evaluating the accuracy as a function of the data set size." The method taught by the paper is interpreted as the machine learning method.) The remainder of claim 5 recites substantially similar subject matter to claim 1 and is rejected with the same rationale, mutatis mutandis.

Regarding claim 6, Soekhoe teaches A non-transitory computer-readable recording medium having embodied thereon a machine learning program comprising computer-implemented modules including: (Page 55 states "To conduct our experiments, we use the Caffe deep learning framework developed at UC Berkeley [7]. We make use of a single Nvidida GTX Titan X graphics card to enable Caffe in GPU mode, to speed up our training time." Therefore, as a processor executes the method, the method must be implemented on a non-transitory computer-readable recording medium having embodied thereon a machine learning program comprising computer-implemented modules including the method.) The remainder of claim 6 recites substantially similar subject matter to claim 1 and is rejected with the same rationale, mutatis mutandis.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JESSICA THUY PHAM, whose telephone number is (571) 272-2605. The examiner can normally be reached Monday - Friday, 9:00 A.M. - 5:00 P.M. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.T.P./ Examiner, Art Unit 2121
/Li B. Zhen/ Supervisory Patent Examiner, Art Unit 2121
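As an editorial note on the claims 2-4 mapping above, the freezing rule the examiner quotes from Soekhoe (page 50) reduces to simple arithmetic over the layer list. The sketch below assumes the paper's AlexNet layout (five convolutional plus three fully-connected layers) and uses its tuned Tiny-ImageNet values (l = 3, n = 300) as defaults; the layer names and the helper are illustrative, not the paper's code.

```python
# Freeze the first l layers until more than n instances per class exist;
# defaults are the tuned Tiny-ImageNet values quoted above (l = 3, n = 300).
ALEXNET_LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]

def trainable_layers(instances_per_class: int, n: int = 300, l: int = 3):
    frozen = l if instances_per_class <= n else 0
    return ALEXNET_LAYERS[frozen:]

# Claim 4 mapping: with l = 5 and sparse target data, only the three
# fully-connected layers remain trainable; with rich data, all layers train.
assert trainable_layers(100, l=5) == ["fc6", "fc7", "fc8"]
assert trainable_layers(1000, l=5) == ALEXNET_LAYERS
```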

Prosecution Timeline

Aug 10, 2023
Application Filed
Mar 11, 2026
Non-Final Rejection — §101, §102, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 33%
With Interview: 0% (-33.3%)
Median Time to Grant: 3y 3m
PTA Risk: Low

Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
