Detailed Action
This Office Action is in response to the remarks entered on 06/24/2025. Claims 7-14 and 18-20 have been cancelled. Claims 1-6 and 15-17 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 1 is objected to because of the following informalities: “… the second neural network of the plurality of neural network” in line 14 should be “… the second neural network of the plurality of neural networks.”
All claims depending from an objected-to claim are likewise objected to by virtue of their dependency on an objected-to base claim.
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-6 and 15-17 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1,
Step 1: Claim 1 recites a method of processing sensor-originated data using a computing device having at least one processor. Therefore, it is directed to the statutory category of processes.
2A Prong 1:
estimating a processor usage of the at least one processor to perform one of facial expression recognition, gesture recognition, and image segmentation; (a mental process of evaluation, as it merely recites predicting the amount of processor usage, which can be performed in the human mind, and facial expression recognition, gesture recognition, and image segmentation do not require a computer component and can be performed in one’s mind)
comparing the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks (a mental process of evaluation, as it merely recites comparing the estimated processor usage value of the first neural network and the estimated processor usage value of the second neural network, which can be performed in the human mind)
when the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network is less than the estimated processor usage of at least the second neural network of the plurality of neural network (a mental process of evaluation, as it merely recites comparing the estimated processor usage value of the first neural network and the estimated processor usage value of the second neural network, which can be performed in the human mind)
2A Prong 2:
A method of processing sensor-originated data, comprising RGB image data representative of an image, using a computing device having at least one processor, the sensor-originated data representative of one or more physical quantities measured by one or more sensors, and the method comprising: (mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks, wherein the first and second neural network are configured (mere instructions to apply an exception using a computer MPEP 2106.05(f), because it merely recites utilizing a generic neural network model to generate an output, which amounts to applying a generic computer to perform a mental process)
processing, by said at least one processor of the computing device, the sensor-originated data using at least the first neural network, wherein: (The computing device having at least one processor in the claim is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
each of the first and second neural networks is configured to generate output data of the same type; (mere instructions to apply an exception using a computer MPEP 2106.05(f))
the first neural network is configured to receive a first set of input data types, (an insignificant extra-solution activity MPEP 2106.05(g) of gathering statistics)
the first set of input data types including the RGB image data (a field of use and technological environment MPEP 2106.05(h))
the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types (field of use and technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
the sensor-originated data representative of one or more physical quantities measured by one or more sensors (field of use and technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the depth data is available for processing due to use of an intermediate neural network that generates depth data based on the RGB data; (amounts to insignificant extra-solution activity, which is mere data gathering MPEP 2106.05(g))
based on the indication, switching, by said at least one processor of the computing device, subsequent processing of sensor-originated data from using the first neural network to using the second neural network (mere instructions to perform the facial expression recognition using a computer MPEP 2106.05(f))
The additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity and generic computer functions restricted to a field of use, implemented to perform the abstract idea identified above.
2B:
A method of processing sensor-originated data, comprising RGB image data representative of an image, using a computing device having at least one processor, the sensor-originated data representative of one or more physical quantities measured by one or more sensors, and the method comprising: (mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks, wherein the first and second neural network are configured to perform one of facial expression recognition, gesture recognition, and image segmentation (mere instructions to apply an exception using a computer MPEP 2106.05(f), because it merely recites utilizing a generic neural network model to generate an output, which amounts to applying a generic computer to perform a mental process)
processing, by said at least one processor of the computing device, the sensor-originated data using at least the first neural network, wherein: (The computing device having at least one processor in the claim is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
each of the first and second neural networks is configured to generate output data of the same type; (mere instructions to apply an exception using a computer MPEP 2106.05(f))
the first neural network is configured to receive a first set of input data types, (indicated as an insignificant extra-solution activity MPEP 2106.05(g) in Step 2A Prong 2, therefore re-evaluated as well-understood, routine, and conventional activity MPEP 2106.05(d)(II)(iv) of gathering statistics, OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93)
the first set of input data types including the RGB image data (field of use and technological environment MPEP 2106.05(h)).
the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types; (field of use and technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
the sensor-originated data representative of one or more physical quantities measured by one or more sensors (field of use and technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the depth data is available for processing due to use of an intermediate neural network that generates depth data based on the RGB data; (indicated as an insignificant extra-solution activity in Step 2A Prong 2. Therefore, the limitation is re-evaluated in Step 2B as well-understood, routine, and conventional activity MPEP 2106.05(d)(II)(iv) of gathering statistics, OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
based on the indication, switching, by said at least one processor of the computing device, subsequent processing of sensor-originated data from using the first neural network to using the second neural network (mere instructions to perform the facial expression recognition using a computer MPEP 2106.05(f))
The additional elements identified above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity, generic computer functions, and elements restricted to a field of use, implemented to perform the abstract idea identified above.
Regarding claim 2,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 1.
2A Prong 2: wherein processing the sensor-originated data comprises processing the sensor-originated data using a set of neural networks comprising the first neural network (a field of use and technological environment MPEP 2106.05(h), as it merely discloses usage of a generic neural network with specific data to process input data and generate an output, which is restricted to the field of using a generic neural network)
2B: wherein processing the sensor-originated data comprises processing the sensor-originated data using a set of neural networks comprising the first neural network (a field of use and technological environment MPEP 2106.05(h), as it merely discloses usage of a generic neural network with specific data to process input data and generate an output, which is restricted to the field of using a generic neural network)
Regarding claim 3,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 2.
2A Prong 2: wherein the set of neural networks comprises a plurality of neural networks connected such that an output of one neural network in the set forms an input for another neural network in the set (a field of use and technological environment MPEP 2106.05(h), as it merely defines the structure of the neural network, which is restricted to the field of using a generic neural network)
2B: wherein the set of neural networks comprises a plurality of neural networks connected such that an output of one neural network in the set forms an input for another neural network in the set (a field of use and technological environment MPEP 2106.05(h), as it merely defines the structure of the neural network, which is restricted to the field of using a generic neural network)
Regarding claim 4,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 2.
2A Prong 2: wherein the set of neural networks comprises a sequence of neural networks including the first neural network (a field of use or technological environment MPEP 2106.05(h), as it merely defines the structure of the neural network, which is restricted to the field of using a generic neural network)
2B: wherein the set of neural networks comprises a sequence of neural networks including the first neural network (a field of use or technological environment MPEP 2106.05(h), as it merely defines the structure of the neural network, which is restricted to the field of using a generic neural network)
Regarding claim 5,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 1.
2A Prong 2: wherein the sensor-originated data comprises at least one of: image data representative of an image; audio data representative of a sound; (a field of use or technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
2B: wherein the sensor-originated data comprises at least one of: image data representative of an image; audio data representative of a sound; (a field of use or technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
Regarding claim 6,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 5.
2A Prong 2: wherein at least one of: the image data comprises image feature data representative of at least one feature of the image; and the audio data comprises audio feature data representative of at least one feature of the sound (a field of use or technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
2B: wherein at least one of: the image data comprises image feature data representative of at least one feature of the image; and the audio data comprises audio feature data representative of at least one feature of the sound (a field of use or technological environment MPEP 2106.05(h), as it merely provides a description of what the data is, which does not provide any improvement to the invention)
Regarding claim 15,
Step 1: Claim 15 recites a computing device comprising: at least one processor; storage accessible by the at least one processor, the storage configured to store sensor-originated data. Therefore, it is directed to the statutory category of machines.
2A Prong 1:
to perform one of facial expression recognition, gesture recognition, and image segmentation (a mental process of evaluation, as facial expression recognition, gesture recognition, and image segmentation do not require a computer component and can be performed in one’s mind)
estimate a processor usage of the at least one processor to process the sensor-originated data using the first neural network and using at least the second neural network of a plurality of neural networks (a mental process of evaluation, as estimating processor usage based on data can be performed in the human mind)
compare the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks (a mental process of evaluation, as comparing estimated data can be performed in the human mind)
when the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network is less than the estimated processor usage of at least the second neural network of the plurality of neural network (a mental process of evaluation, as it merely recites comparing the estimated processor usage value of the first neural network and the estimated processor usage value of the second neural network, which can be performed in the human mind)
2A Prong 2:
A computing device comprising: at least one processor; storage accessible by the at least one processor, the storage configured to store sensor-originated data, comprising RGB image data representative of an image, the sensor-originated data representative of one or more physical quantities measured by one or more sensors (mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
wherein the at least one processor is configured to implement a plurality of neural networks including a first neural network and a second neural network configured to generate output data of the same type, wherein the first and second neural network are configured to (mere instructions to apply an exception using a computer MPEP 2106.05(f))
wherein the first neural network is configured to receive a first set of input data types, (an insignificant extra-solution activity MPEP 2106.05(g) of gathering statistics) the first set of input data types including the RGB image data, (a field of use and technological environment MPEP 2106.05(h)) and the second neural network is configured to receive a second set of input data types, (an insignificant extra-solution activity MPEP 2106.05(g) of gathering statistics) the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types; (a field of use and technological environment MPEP 2106.05(h))
a controller configured to: estimate; compare; (mere instructions to apply an exception using a computer MPEP 2106.05(f))
obtain, during processing of the sensor-originated data, an indication that the depth data due to use of an intermediate neural network that generates depth data based on the RGB data is available for processing; (insignificant extra-solution activity, and amounts to gathering statistics MPEP 2106.05(g))
based on the indication, switching subsequent processing of sensor-originated data from using the first neural network to using the second neural network (mere instructions to apply an exception using a computer MPEP 2106.05(f))
The additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity and generic computer functions restricted to a field of use, implemented to perform the abstract idea identified above.
2B:
A computing device comprising: at least one processor; storage accessible by the at least one processor, the storage configured to store sensor-originated data, comprising RGB image data representative of an image, the sensor-originated data representative of one or more physical quantities measured by one or more sensors (mere instructions to apply the exception using a generic computer component MPEP 2106.05(f))
wherein the at least one processor is configured to implement a plurality of neural networks including a first neural network and a second neural network configured to generate output data of the same type, wherein the first and second neural network are configured to (mere instructions to apply an exception using a computer MPEP 2106.05(f))
wherein the first neural network is configured to receive a first set of input data types, (indicated as an insignificant extra-solution activity MPEP 2106.05(g). Therefore, the limitation is re-evaluated as well-understood, routine, and conventional activity of gathering statistics MPEP 2106.05(d)(II)(iv), OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93) the first set of input data types including the RGB image data, (a field of use and technological environment MPEP 2106.05(h)) and the second neural network is configured to receive a second set of input data types, (indicated as an insignificant extra-solution activity MPEP 2106.05(g). Therefore, the limitation is re-evaluated as well-understood, routine, and conventional activity of gathering statistics MPEP 2106.05(d)(II)(iv), OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93) the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types; (a field of use and technological environment MPEP 2106.05(h))
a controller configured to: estimate; compare; (mere instructions to apply an exception using a computer MPEP 2106.05(f))
obtain, during processing of the sensor-originated data, an indication that the depth data due to use of an intermediate neural network that generates depth data based on the RGB data is available for processing; (indicated as an insignificant extra-solution activity in Step 2A Prong 2. Therefore, the limitation is re-evaluated as well-understood, routine, and conventional activity of gathering statistics MPEP 2106.05(d)(II)(iv), OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
based on the indication, switching subsequent processing of sensor-originated data from using the first neural network to using the second neural network (mere instructions to apply an exception using a computer MPEP 2106.05(f))
The additional elements identified above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity, generic computer functions, and elements restricted to a field of use, implemented to perform the abstract idea identified above.
Regarding claim 16,
Step 1: A machine, as above.
2A Prong 1: Incorporates the rejection of claim 15.
2A Prong 2: the controller (mere instructions to apply an exception using a generic computer component MPEP 2106.05(f))
configured to process the sensor-originated data using a set of neural networks comprising the first neural network (a field of use and technological environment MPEP 2106.05(h))
2B: configured to process the sensor-originated data using a set of neural networks comprising the first neural network (a field of use and technological environment MPEP 2106.05(h))
Regarding claim 17,
Step 1: A machine, as above.
2A Prong 1: Incorporates the rejection of claim 16.
2A Prong 2: computing device (mere instructions to apply an exception using a generic computer component MPEP 2106.05(f))
wherein the set of neural networks comprises a sequence of neural networks including the first neural network (a field of use or technological environment MPEP 2106.05(h))
2B: computing device (mere instructions to apply an exception using a generic computer component MPEP 2106.05(f))
wherein the set of neural networks comprises a sequence of neural networks including the first neural network (a field of use or technological environment MPEP 2106.05(h))
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-6 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Tseng (US 20200302292 A1) in view of Yang (US 20190347823 A1, hereinafter ‘Yang’) and further in view of Truong (US 10452700 B1, hereinafter ‘Truong’).
Regarding claim 1, Tseng teaches:
A method of processing sensor-originated data, comprising ([Tseng, 0103] “The input data may be collected by one or more sensors provided in, or in association with, the user device 202 which implements the neural network. Such sensors may include but are not limited to image sensors, light sensors, microphones, electrodes, gyroscopes and accelerometers.” Light, sound, electrodes, and acceleration are physical quantities. [Tseng, 0023] teaches the system comprises at least one processor that stores and executes the neural network processes.):
estimating a processor usage of the at least one processor to process the sensor-originated data using a first neural network and using at least a second neural network of a plurality of neural networks, ([Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage. [Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3);
comparing the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks ([Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage. [Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3. The indication of computational resources that may be used may also be selected based on the performance information and/or may be generated on-the-fly, which corresponds to the comparing process);
when the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network is less than the estimated processor usage of at least the second neural network of the plurality of neural network, processing, by said at least one processor of the computing device, the sensor-originated data using at least the selected neural network, wherein: each of the first and second neural networks is configured to generate output data ([Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage.
[Tseng, 0093 and 0095] “Subsequently, in operation S4-6, the apparatus 200 determines whether each of the trained neural networks satisfies the one or more imposed constraints (e.g. those received/determined in operation S4-1), e.g. comparing the monitored computational resource use with the one or more imposed constraints. For instance, the apparatus 200 may determine whether the accuracy of the trained neural network satisfies the minimum acceptable accuracy constraint and/or whether the computational resource usage of the trained neural network satisfies the computational resource constraint.” The output data of the neural network is generated by processing the validation data through each of the trained neural networks. Since there are a plurality of neural networks, there are at least a first neural network and a second neural network);
Tseng does not specifically disclose:
wherein the first and second neural network are configured to perform one of facial expression recognition, gesture recognition, and image segmentation
processing, by said at least one processor of the computing device, the sensor-originated data using at least the selected neural network, wherein: each of the first and second neural networks is configured to generate output data of the same type; and
the first neural network is configured to receive a first set of input data types, the first set of input data types including the RGB image data, and the second neural network is configured to receive a second set of input data types, the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types;
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the depth data is available for processing due to use of an intermediate neural network that generates depth data based on the RGB data; and
based on the indication, switching, by at least one processor of the computing device, subsequent processing of sensor-originated data from using the first neural network to using the second neural network.
Yang teaches:
wherein the first and second neural network are configured to perform one of facial expression recognition, gesture recognition, and image segmentation ([Yang, 0003], [0081] and [0083] collectively disclose that the first neural network and the second neural network perform facial recognition and the target object is a face. [Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 process the target image data and generate the same type of output data 435 and 436. [Yang, 0064] further discloses performing living body detection)
processing, by said at least one processor of the computing device, the sensor-originated data using at least the selected neural network, wherein: each of the first and second neural networks is configured to generate output data of the same type; and ([Yang, 0005] discloses that the target images are sensed by a first sensor and a second sensor. [Yang, Fig. 4C; 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436)
the first neural network is configured to receive a first set of input data types, the first set of input data types including the RGB image data, and the second neural network is configured to receive a second set of input data types, the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types; ([Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436. Both neural networks receive face key point information 433 obtained after the data preprocessing module (RGB image data), the second neural network receives a face near-infrared map 432, and the first neural network receives a face depth map 431. The face depth map 431 is not included in the input dataset input to the second neural network. [Yang, 0033 and 0085] discloses that the key point detection is performed on the target image and the target image detected by the sensors is an RGB image)
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the depth data is available for processing due to use of …; and ([Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436. Both neural networks receive face key point information 433 obtained after the data preprocessing module (RGB image data), the second neural network receives a face near-infrared map 432, and the first neural network receives a face depth map 431. [Yang, 0083] discloses acquiring depth information (an indication that the depth data is available) using the first sensor different from the second sensor)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Tseng and Yang, to use the method of obtaining two different output features by processing the depth information and the RGB data using the first neural network and processing the RGB data using the second neural network of Yang to implement the machine learning system of Tseng. The suggestion and/or motivation to do so is to allow the machine learning model to produce results even if the input data is incomplete and to produce a more accurate result when the depth data is available [Yang, 0065].
However, Tseng in view of Yang does not specifically disclose:
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the depth data is available for processing due to use of an intermediate neural network that generates depth data based on the RGB data; and
based on the indication, switching, by at least one processor of the computing device, subsequent processing of sensor-originated data from using the first neural network to using the second neural network.
Truong teaches:
obtaining, during processing of the sensor-originated data by said at least one processor of the computing device, an indication that the … due to use of an intermediate neural network that generates …; and ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to determine the data type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. The output of the classifier (the type) is the indication that the data type is available for processing. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network)
based on the indication, switching, by at least one processor of the computing device, subsequent processing of … from using the first neural network to using the second neural network. ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to identify a type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Tseng, Yang, and Truong, to use the method of selecting a neural network from a set of neural networks based on the availability of a specific data type of Truong to implement the machine learning system of Tseng. The suggestion and/or motivation to do so is to improve the accuracy and efficiency of the machine learning method by selecting an appropriately trained neural network from among a set of neural networks ([Truong, col 7, lines 20-29] and [Truong, col 1, lines 60-64]).
Regarding claim 2, Tseng in view of Yang and further in view of Truong teaches:
wherein processing the sensor-originated data comprises processing the sensor-originated data using a set of neural networks comprising the selected neural network. ([Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage. [Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3)
Regarding claim 3, Tseng in view of Yang and further in view of Truong teaches:
wherein the set of neural networks comprises a plurality of neural networks connected such that an output of one neural network in the set forms an input for another neural network in the set ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to identify a type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network).
Regarding claim 4, Tseng in view of Yang and further in view of Truong teaches:
wherein the set of neural networks comprises a sequence of neural networks including the selected neural network ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to identify a type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. The connection between the classifier and the selected neural network is interpreted as the sequence of neural networks. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network).
Regarding claim 5, Tseng teaches:
wherein the sensor-originated data comprises at least one of: image data representative of an image; audio data representative of a sound; ([Tseng, 0103] “The input data may be collected by one or more sensors provided in, or in association with, the user device 202 which implements the neural network. Such sensors may include but are not limited to image sensors, light sensors, microphones, electrodes, gyroscopes and accelerometers.” Light, sound, electrodes, and acceleration are physical quantities.).
Regarding claim 6, Tseng teaches:
wherein at least one of: the image data comprises image feature data representative of at least one feature of the image; and the audio data comprises audio feature data representative of at least one feature of the sound ([Tseng, 0103] “The input data may be collected by one or more sensors provided in, or in association with, the user device 202 which implements the neural network. Such sensors may include but are not limited to image sensors, light sensors, microphones, electrodes, gyroscopes and accelerometers.” Light, sound, electrodes, and acceleration are physical quantities. Microphones can capture at least one of frequency, volume, pitch …, which are features included in the sound. Image sensors can capture at least one of color, shape, shadow …, which are features included in the image data).
Regarding claim 15, Tseng teaches:
A computing device comprising: at least one processor; storage accessible by the at least one processor, the storage configured to store sensor-originated data, comprising RGB image data representative of an image, the sensor-originated data representative of one or more physical quantities measured by one or more sensors ([Tseng, 0103] “Such sensors may include but are not limited to image sensors, light sensors, microphones, electrodes, gyroscopes and accelerometers.” Light, sound, electrodes, and acceleration are physical quantities. [Tseng, 0023] teaches the system comprises at least one processor that stores and executes the neural network processes. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage.);
a controller configured to: estimate a processor usage of the at least one processor to process the sensor-originated data using the first neural network and using at least the second neural network of a plurality of neural networks ([Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage.);
compare the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network and at least the second neural network of the plurality of neural networks ([Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage.
[Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3. The indication of computational resources that may be used may also be selected based on the performance information and/or may be generated on-the-fly, which corresponds to the comparing process.);
when the estimated processor usage of the at least one processor to process the sensor-originated data using the first neural network is less than the estimated processor usage of at least the second neural network of the plurality of neural networks, process the sensor-originated data using at least the first neural network ([Tseng, 0103] “The input data may be collected by one or more sensors provided in, or in association with, the user device 202 which implements the neural network. Such sensors may include but are not limited to image sensors, light sensors, microphones, electrodes, gyroscopes and accelerometers.” ‘The input data’ implies that the data will be processed by the neural network. [Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage.
[Tseng, 0059] “In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202.” The paragraph discloses that the user device selects a neural network.).
Tseng does not specifically disclose:
wherein the at least one processor is configured to implement a plurality of neural networks including a first neural network and a second neural network configured to generate output data of the same type, wherein the first and second neural network are configured to perform one of facial expression recognition, gesture recognition, and image segmentation,
wherein the first neural network is configured to receive a first set of input data types, the first set of input data types including the RGB image data, and the second neural network is configured to receive a second set of input data types, the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types;
obtain, during processing of the sensor-originated data, an indication that the depth data due to use of an intermediate neural network that generates depth data based on the RGB data is available for processing; and
based on the indication, switching subsequent processing of sensor-originated data from using the first neural network to using the second neural network.
Yang teaches:
wherein the at least one processor is configured to implement a plurality of neural networks including a first neural network and a second neural network configured to generate output data of the same type, wherein the first and second neural network are configured to perform one of facial expression recognition, gesture recognition, and image segmentation, ([Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436. [Yang, 0003], [0081] and [0083] collectively disclose that the first neural network and the second neural network perform facial recognition and the target object is a face)
wherein the first neural network is configured to receive a first set of input data types, the first set of input data types including the RGB image data, and the second neural network is configured to receive a second set of input data types, the second set of input data types including the RGB image data and further including depth data representative of a depth in an environment, wherein the depth data is not included in the first set of input data types; ([Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436. Both neural networks receive face key point information 433 obtained after the data preprocessing module (RGB image data), the second neural network receives a face near-infrared map 432, and the first neural network receives a face depth map 431. The face depth map 431 is not included in the input dataset input to the second neural network. [Yang, 0033 and 0085] discloses that the key point detection is performed on the target image and the target image detected by the sensors is an RGB image)
obtain, during processing of the sensor-originated data, an indication that the depth data due to use of an … is available for processing; and ([Yang, Fig. 4C; 0041-0042 and 0145-0146] The first neural network branch and the second neural network branch 434 generate the same type of output data 435 and 436. Both neural networks receive face key point information 433 obtained after the data preprocessing module (RGB image data), the second neural network receives a face near-infrared map 432, and the first neural network receives a face depth map 431. [Yang, 0083] discloses acquiring depth information (an indication that the depth data is available) using the first sensor different from the second sensor)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Tseng and Yang, to use the method of obtaining two different output features by processing the depth information and the RGB data using the first neural network and processing the RGB data using the second neural network of Yang to implement the machine learning system of Tseng. The suggestion and/or motivation to do so is to allow the machine learning model to produce results even if the input data is incomplete and to produce a more accurate result when the depth data is available [Yang, 0065].
However, Tseng in view of Yang does not specifically disclose:
obtain, during processing of the sensor-originated data, an indication that the depth data due to use of an intermediate neural network that generates depth data based on the RGB data is available for processing; and
based on the indication, switching subsequent processing of sensor-originated data from using the first neural network to using the second neural network.
Truong teaches:
obtain, during processing of the …, an indication that the … due to use of an intermediate neural network that generates … is available for processing; and ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to determine the data type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. The output of the classifier (the type) is the indication that the data type is available for processing. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network)
based on the indication, switching subsequent processing of … from using the first neural network to using the second neural network. ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to identify a type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Tseng, Yang, and Truong, to use the method of selecting a neural network from a set of neural networks based on the availability of a specific data type of Truong to implement the machine learning system of Tseng. The suggestion and/or motivation to do so is to improve the accuracy and efficiency of the machine learning method by selecting an appropriately trained neural network from among a set of neural networks ([Truong, col 7, lines 20-29] and [Truong, col 1, lines 60-64]).
Regarding claim 16, Tseng teaches:
wherein the controller is configured to process the sensor-originated data using a set of neural networks comprising the selected neural network ([Tseng, 0016] The computational resource use by the neural network may comprise one or any combination of: CPU usage when executing the neural network, latency associated with the neural network, and memory usage. [Tseng, 0059; Figure 3] In operation S3-2, the user device 202, for example by utilizing the specific application, may select a neural network model from a plurality of neural network models locally stored on the device 202 … This selection may be based on the computational resource constraints of the device 202. [Tseng, 0061] The estimate of accuracy and/or computational resource use of the selected model which comprises the first or the second model is determined in S3-3)
Regarding claim 17, Tseng in view of Yang and further in view of Truong teaches:
wherein the set of neural networks comprises a sequence of neural networks including the selected neural network. ([Truong, Fig. 7; col 17, line 4 – col 18, line 25] Blocks 704 and 706 disclose applying a classifier neural network (intermediate neural network) to identify a type of the unstructured data in order to select a corresponding neural network, and blocks 706 and 708 disclose selecting a neural network (the first neural network and the second neural network) and inputting the data to the selected neural network. The connection between the classifier and the selected neural network is interpreted as the sequence of neural networks. [Truong, col 19, lines 3-5] discloses that the classifier may be a character-level convolutional neural network)
Response to Arguments
Response to Arguments under 35 U.S.C. 101
Arguments: Applicant argues that claim 1 is now eligible because (a) it has been amended and is now limited to a computer-implemented method of processing sensor-originated data, wherein the method includes performing one of facial expression recognition, gesture recognition, and image segmentation, (b) the claimed method is directed to a specific technological improvement, namely providing an improved method for performing one of facial expression recognition, gesture recognition, and image segmentation, (c) in claim 3 of Example 47, applying neural networks to a specific task is identified as patent-eligible, and (d) the human mind is not capable of performing these techniques using first and second neural networks and switching based on estimated processor usage. [Remarks, page 7]
Examiner’s Answer: Examiner respectfully disagrees. Regarding (a), facial expression recognition, gesture recognition, and image segmentation do not require a computer component, and the human mind is capable of recognizing a face and dividing images manually. The limitations “the first and second neural network” and “processing, by said at least one processor of the computing device, the sensor-originated data using at least the first neural network” are directed to mere instructions to perform facial expression recognition, gesture recognition, and image segmentation on generic computer components.
Regarding (b), MPEP 2106.05(a) states that “If it is asserted that the invention improves upon conventional functioning of a computer, or upon conventional technology or technological processes, a technical explanation as to how to implement the invention should be present in the specification. That is, the disclosure must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. The specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. Conversely, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology.” Examiner performed this evaluation and concluded that the improvement relied upon by Applicant does not have a sufficient nexus to the claim language itself to suggest to the ordinary artisan that the claims are directed to an improvement in technology. The independent claims themselves recite selecting a neural network from a set of neural networks based on CPU utilization and the availability of a specific data type. The first neural network, the second neural network, and the selection process are invoked as tools to perform the facial expression recognition and gesture recognition.
Regarding (c), claim 3 of Example 47 is distinguishable from the instant application. First, as noted in (b) and in MPEP 2106.05(a), consideration of whether the claim as a whole includes an improvement to a computer or to a technological field requires an evaluation of the specification and the claim to ensure that a technical explanation of the asserted improvement is present in the specification and that the claim reflects the asserted improvement. Examiner performed this evaluation and concluded that the claim as a whole does not include an improvement to a computer or to a technological field. Second, in claim 3 of Example 47, the additional elements of “(d) detecting a source address associated with the one or more malicious network packets,” “(e) dropping the one or more malicious network packets,” and “(f) blocking future traffic from the source address” linked the abstract idea recited in limitations (b) and (c) to a particular field of use. However, no such nexus exists in the instant claims.
Regarding (d), Examiner agrees that the human mind is not capable of switching the neural network based on estimated processor usage. However, switching (selecting) the neural network based on estimated processor usage is directed to mere instructions to perform facial expression recognition, gesture recognition, and image segmentation on generic computer components.
Accordingly, the arguments regarding claim 1 are not persuasive. Similarly, the arguments regarding claim 15 are not persuasive. Claims 2-6 and 16-17 depend from claims 1 and 15, respectively. Therefore, the arguments regarding claims 2-6 and 16-17 are not persuasive.
Response to Arguments under 35 U.S.C. 103
Applicant’s arguments with respect to claims 1-6 and 15-17 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for matter specifically challenged in the argument.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUN KWON whose telephone number is (571)272-2072. The examiner can normally be reached M-F 7:30AM – 4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JUN KWON/Examiner, Art Unit 2127
/RYAN C VAUGHN/Primary Examiner, Art Unit 2125