DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are presented for examination.
This action is non-final.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. 2022112953012, filed on 10/21/2022.
Information Disclosure Statement
The information disclosure statement submitted on 11/11/2022 has been considered by the examiner. All documents have been considered; see the attached form.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4 and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 4 recites the limitation "the number of neurons" in line 23.
Claim 14 recites the limitation "the number of neurons" in line 24.
There is insufficient antecedent basis for this limitation in each of these claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite a judicial exception (an abstract idea) that is not integrated into a practical application, and the claims do not recite additional elements that amount to significantly more than the judicial exception.
Step 1: Yes. Claims 1-10 are directed to a method, which falls within the statutory category of a process, while claims 11-20 are directed to a device, which falls within the statutory category of a machine.
Step 2A: Prong One:
Claims 1, 5-8, 11, 15-16, and 20 recite abstract ideas, which are identified below.
Regarding claims 1, 11, and 20:
determining a target model for processing data (insignificant extra-solution activity – data gathering);
dividing the target model into a plurality of modules that implement different tasks (insignificant extra-solution activity – data gathering);
determining a quantity of parameters of a target module in the plurality of modules and a size of transmission data related to the target module (insignificant extra-solution activity – data gathering);
determining an arrangement position of the target module based on the quantity and the size: under its broadest reasonable interpretation, this limitation recites an abstract idea in the form of a mental process. A person of ordinary skill in the art can make a judgment on the arrangement of the module based on the gathered information of quantity and size. This step "can be performed in the human mind, or by a human using a pen and paper." Mental processes include observations, evaluations, judgments, and opinions, and these claims make a judgment based on observed or gathered information.
Regarding claims 5-6 and 15-16:
determining a ratio of the size to the quantity: this is simple algebra that can be done in the human mind or with pen and paper, so under its broadest reasonable interpretation this limitation also recites an abstract idea in the form of a mental process. A person of ordinary skill in the art can determine the ratio of the size to the quantity.
Claims 5-6 and 15-16 have contingent claim limitations: arranging, if the ratio is greater than a threshold, the target module in an edge device; and arranging, if the ratio is less than or equal to the threshold, the target module in a server. Under its broadest reasonable interpretation, each of these limitations also recites an abstract idea in the form of a mental process: the limitation compares known values and makes a judgment based on the comparison. A person of ordinary skill in the art can compare the determined ratio with the threshold to decide whether the arrangement should be on the edge or on the server. A claim to collecting and comparing known information (claim 1) recites steps that can be practically performed in the human mind, Classen Immunotherapies, Inc. v. Biogen IDEC, 659 F.3d 1057, 1067, 100 USPQ2d 1492, 1500 (Fed. Cir. 2011).
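Examiner note: as a purely illustrative example using hypothetical values (not drawn from the claims or the cited art), a target module with 3,000,000 bytes of transmission data and 1,000,000 parameters yields a ratio of 3,000,000 / 1,000,000 = 3; comparing 3 against an assumed threshold of 2 and concluding that the module belongs in an edge device is a single division followed by a single comparison, which can practically be performed in the human mind or with pen and paper.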
Step 2A, Prong Two:
The above judicial exceptions are not integrated into a practical application because the claims do not recite additional elements, or a combination of additional elements, that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception.
The claims recite gathering data and data manipulation, which is insignificant extra-solution activity. Adding insignificant extra-solution activity to the judicial exception, e.g., mere data gathering in conjunction with a law of nature or abstract idea, such as a step of obtaining information about credit card transactions so that the information can be analyzed by an abstract mental process, was discussed in CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (see MPEP § 2106.05(g)). The following claim limitations amount to insignificant extra-solution activity, such as data gathering and data manipulation, or merely further define the abstract idea.
Claims 1, 11, and 20: determining a target model for processing data (insignificant extra-solution activity – data gathering);
dividing the target model into a plurality of modules that implement different tasks (insignificant extra-solution activity – data gathering);
determining a quantity of parameters of a target module in the plurality of modules and a size of transmission data related to the target module (insignificant extra-solution activity – data gathering).
Claims 7-8 and 17-18 also further define the abstract idea without reciting additional elements that integrate the exception into a practical application.
Step 2B: No. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the claims recite the additional elements of an "electronic device comprising a processor and memory" and a "computer program product" (claims 11-19 and 20), which are merely used as tools to perform the abstract idea. Merely using a computer as a tool does not make an improvement to the functioning of the computer. Collecting information and using the output for the rest of the claim adds extra-solution activity without integrating the abstract idea into a practical application. Adding insignificant extra-solution activity to the judicial exception, e.g., mere data gathering in conjunction with a law of nature or abstract idea, such as a step of obtaining information about credit card transactions so that the information can be analyzed by an abstract mental process, was discussed in CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011).
Claims 2-4, 9-10, 12-14, and 19 would otherwise overcome the 35 U.S.C. 101 rejection, either by reciting additional elements that integrate the abstract idea into a practical application or by not themselves reciting an abstract idea; however, these claims depend from claims that recite abstract ideas, as analyzed above. Therefore, claims 1-20 are not found eligible under 35 U.S.C. 101 based on the above analysis.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3 and 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over G. Qu, H. Wu, R. Li and P. Jiao, "DMRO: A Deep Meta Reinforcement Learning-Based Task Offloading Framework for Edge-Cloud Computing," in IEEE Transactions on Network and Service Management, vol. 18, no. 3, pp. 3448-3459, Sept. 2021 (hereinafter "Qu"), in view of S. Teerapittayanon, B. McDanel and H. T. Kung, "Distributed Deep Neural Networks Over the Cloud, the Edge and End Devices," 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, USA, 2017 (hereinafter "Teerapittayanon").
Regarding claim 1, Qu teaches a method for model arrangement, comprising (Section 1, "Introduction": "Therefore, making the optimal decision is the most critical issue for edge offloading. It needs to dynamically decide whether the task should be offloaded to the edge server or cloud server.");
determining a quantity of parameters of a target module in the plurality of modules and a size of transmission data related to the target module(section 1, “introduction”, The process of task offloading is generally affected by a variety of factors in different areas, e.g., user preferences, wireless communication channels, network connection quality, mobility of IoT devices device and availability of edge/cloud servers. Therefore, making the optimal decision is the most critical issue for edge offloading. It needs to dynamically decide whether the task should be offloaded to the edge server or cloud server… Application Partition: Since different tasks usually have different amounts of computation and communication, before performing task offloading operation, it is better to divide the task into a workflow with multiple associated subtasks or as a series of independent subtasks, and then offload the subtasks separately. Among them, some subtasks are executed on the IoT devices, the others are executed on the relatively powerful server, making full use of the server resources, thereby greatly reducing the load of the IoT devices and improving their endurance .Resource Allocation: After the offloading decision is made, resources need to be allocated, including computing power, communication bandwidth, and energy consumption.)
determining an arrangement position of the target module based on the quantity and the size( section 1 “Introduction”, To tackle the above challenges, we design an edge-cloud offloading framework in this paper, where IoT devices can choose to shift their computing tasks either to edge servers or cloud servers. Edge servers make offloading decisions based on task information for each device, reducing latency and energy consumption).
Examiner note: As cited above, offloading decisions are affected by different parameters, and the tasks are divided by their amounts of computation and communication. The amount of computation is therefore interpreted as the size, and other factors, such as the amount of communication, are interpreted as the quantity.
Qu does not explicitly teach determining a target model for processing data and
dividing the target model into a plurality of modules that implement different tasks.
Teerapittayanon, however, teaches determining a target model for processing data (Section 1, "Introduction": "To address these concerns under the same optimization framework, it is desirable that a system could train a single end-to-end model, such as a DNN, and partition it between end devices and the cloud, in order to provide a simpler and more principled approach"), and dividing the target model into a plurality of modules that implement different tasks (Section 1, "Introduction": "An example of one such distributed approach is to combine a small NN model (less number of parameters) on end devices and a larger NN model (more number of parameters) in the cloud. The small model at an end device can quickly perform initial feature extraction, and also classification if the model is confident. Otherwise, the end device can fall back to the large NN model in the cloud, which performs further processing and final classification").
Qu and Teerapittayanon are considered analogous art to the claimed invention because both focus on model partitioning for edge-cloud computing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Teerapittayanon's teaching of determining a target model and dividing it into a plurality of DNN modules that implement different tasks to Qu's teaching of using different parameters to offload modules to the edge or the cloud to minimize latency and energy consumption.
The motivation would have been that a single properly trained DNN can be mapped onto a distributed computing hierarchy to meet the accuracy, communication, and latency requirements of a target application while gaining inherent benefits associated with distributed computing, such as fault tolerance and privacy (Teerapittayanon, Conclusion).
Regarding claim 2, the combination of Qu and Teerapittayanon teaches all the limitations of claim 1, and Qu also teaches that determining the target model comprises: determining a code of the target model (Section B, "Intelligent Offloading Decision-Making": "Besides, Neurosurgeon [30] was a fine-grained partitioning method that can find the optimal dividing point in DNNs according to different factors, and made full use of the resources of cloud servers and mobile devices to minimize the computational delays or energy consumption in IoT environments"); the different factors are interpreted as a code used to find the optimal dividing points in DNNs for model arrangement; and determining a group of neural network models in the target model by analyzing the code (Section B, "Intelligent Offloading Decision-Making": "Deep learning methods refer to the classification of the input task information through the multi-layer neural network to determine the final offloading position. Huang et al. [28] provided an algorithm that adopted distributed deep learning to solve the offloading problem of mobile edge networks. It used parallel and distributed DNNs to produce offloading decisions and achieved good results.").
Regarding claim 3, the combination of Qu and Teerapittayanon teaches all the limitations of claim 2, and Teerapittayanon also teaches determining a task of each neural network model in the group of neural network models (Section 1, "Introduction": "An example of one such distributed approach is to combine a small NN model (less number of parameters) on end devices and a larger NN model (more number of parameters) in the cloud. The small model at an end device can quickly perform initial feature extraction, and also classification if the model is confident. Otherwise, the end device can fall back to the large NN model in the cloud, which performs further processing and final classification"), and determining neural network models that implement the same task as a module (Section 1, "Introduction": "Multiple models at the cloud, the edge and the device need to be learned jointly to allow coordinated decision making. Computation already performed on end device models should be useful for further processing on edge or cloud models").
Regarding claims 5 and 6, the combination of Qu and Teerapittayanon teaches all the limitations of claim 1. Claims 5 and 6 contain contingent limitations. Qu teaches the claim 5 limitation of determining a ratio of the size to the quantity and arranging, if the ratio is greater than a threshold, the target module in an edge device, and the claim 6 limitation of arranging, if the ratio is less than or equal to the threshold, the target module in a server (Section A, "Inner Model": "As shown in Fig. 3, the inner model is based on a parallel Deep Reinforcement Learning (DRL) algorithm. We apply a classic reinforcement learning method named Q-learning, in which we input environmental parameters, labeled initial parameters and workflow x into the inner model. We use ai to represent the offloading decision of the i-th subtask of the workflow, which is defined as:
[Image: media_image1.png – equation defining the offloading decision ai, reproduced from Qu]
where ai = 0, 1, and 2 indicate that the i-th subtask is executed locally on the IoT device, the edge server, and the cloud server, respectively."). Qu determines an offloading decision for each DNN by a different method but with the same result of arranging the tasks on the edge or cloud server based on a threshold.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, and further in view of Jose M. Alvarez and Mathieu Salzmann, "Learning the number of neurons in deep networks," in Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16), 2016 (hereinafter "Alvarez").
Regarding claim 4, the combination of Qu and Teerapittayanon teaches all the limitations of claim 1, but does not explicitly teach determining the number of neurons in the target module and determining the quantity of the parameters based on the number of neurons.
However, Alvarez teaches determining the number of neurons in the target module (Section 3, "Deep model selection: learning with structure sparsity": "We now introduce our approach to automatically determining the number of neurons in each layer of a deep network while learning the network parameters"), and
determining the quantity of the parameters based on the number of neurons (Section 3, "Deep model selection: learning with structure sparsity": "A general deep network can be described as a succession of L layers performing linear operations on their input, intertwined with non-linearities, such as Rectified Linear Units (ReLU) or sigmoid, and, potentially, pooling operations. Each layer l consists of N_l neurons, each of which is encoded by parameters θ_l^n = [w_l^n, b_l^n], where w_l^n is a linear operator acting on the layer's input and b_l^n is a bias. Altogether, these parameters form the parameter set Θ = {θ_l}_{1≤l≤L}, with θ_l = {θ_l^n}_{1≤n≤N_l}").
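Examiner note: as general background offered only for illustration (not a teaching of Alvarez or a limitation of the claims), the quantity of parameters of a standard fully connected layer follows directly from the neuron counts: a layer with N_{l-1} inputs and N_l neurons has N_{l-1} × N_l weights plus N_l biases; for example, 100 inputs and 50 neurons give 100 × 50 + 50 = 5050 parameters.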
Alvarez is considered analogous art to the combination of Qu and Teerapittayanon and to the claimed invention because all use parameters to analyze models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Alvarez's teaching of determining the number of neurons, and determining the quantity of the parameters based on the number of neurons, to the target model of the combination.
The motivation would have been to reduce redundant parameters and improve the accuracy of the network by automatically determining the number of neurons, which helps to reduce memory usage and computational cost and keeps the model arrangement reasonable (Alvarez, Abstract).
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon and Alvarez, and further in view of C. Zhang et al., "FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning," 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 2021 (hereinafter "Zhang").
Regarding claim 7, the combination of Qu, Teerapittayanon, and Alvarez teaches all the limitations of claim 6, but does not explicitly teach wherein the target model is a voice-based avatar generation model, and the voice-based avatar generation model comprises an audio-based avatar parameter generation module, an avatar video-based avatar parameter generation module, an initial drawing module, and a fine-tuning drawing module.
However, Zhang teaches that the target model is a voice-based avatar generation model (Abstract: "To model such complicated relationships among different face attributes with input audio, we propose a FACe Implicit Attribute Learning Generative Adversarial Network (FACIAL-GAN), which integrates the phonetics-aware, context-aware, and identity-aware information to synthesize the 3D face animation with realistic motions of lips, head poses, and eye blinks"), and that the voice-based avatar generation model comprises an audio-based avatar parameter generation module, an avatar video-based avatar parameter generation module, an initial drawing module, and a fine-tuning drawing module (Section 2, "Related works":
[Image: media_image2.png, reproduced from Zhang]
we introduce the FACIAL- GAN module to integrate phonetic, contextual, and personalized information of the talking, and combine the synthesized 3D model with AU attention map to generate photorealistic videos with synchronized lip motion, personalized and natural head poses and eye blinks… section 3.3 with the combination of geometry, texture and illumination coefficients from reference video and generated expression and head pose coefficients from input audio, we can render the 3D face with personalized head movements. 3D model describes the head pose better than 2D methods by rotating and translating the head. (initial drawing), …section 3.3, “Rendering-to-Video Network”, We employ the rendering-to-video network to translate the rendering images into the final photo-realistic images( fine tuning)).
Zhang is considered analogous art to the combination of Qu, Teerapittayanon, and Alvarez and to the claimed invention because all are focused on multi-module model systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Zhang's teaching of a face generation model that uses a voice and a video to generate a common output of a photo-realistic video of the target face with natural lip motions.
The motivation would have been to obtain better video quality by generating realistic talking face videos with not only synchronized lip motions, but also natural head movements and eye blinks (Zhang, Abstract).
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, Alvarez, and Zhang, and further in view of N. Syed, A. Anwar, Z. Baig, and S. Zeadally, "Artificial Intelligence as a Service (AIaaS) for Cloud, Fog and the Edge: State-of-the-Art Practices," 2018 (hereinafter "Syed").
Regarding claim 8, the combination of Qu, Teerapittayanon, Alvarez, and Zhang teaches all the limitations of claim 7, but the combination does not explicitly teach wherein the audio-based avatar parameter generation module, the avatar video-based avatar parameter generation module, and the fine-tuning drawing module are arranged in the server, and the initial drawing module is arranged in the edge device.
However, Syed teaches wherein the audio-based avatar parameter generation module, the avatar video-based avatar parameter generation module, and the fine-tuning drawing module are arranged in the server, and the initial drawing module is arranged in the edge device (Section 5.2:
[Image: media_image3.png, reproduced from Syed]
Fig. 8. AIaaS services for edge/fog computing in a data-driven offloading scenarios or privacy preserving circumstances requiring FL hierarchical aggregation
The offloading algorithms should consider the factors mentioned previously to make the optimal offloading decisions. For efficient processing of these AI related tasks at the edge and fog, the AIaaS platform can be leveraged to provide services such as model training, serving and monitoring (SLA/SLO parameters), pre-trained model access, data storage and management, optimizing models for different processing layers (edge, fog) and data visualization and analytics services for clients…. Some examples of AI based offloading techniques are lightweight AI models that can be trained at the cloud and optimized for edge and deployed at the edge to do simpler inference tasks. For example, TinyML models which can run on constrained devices [102] can be first trained on the cloud AIaaS deployment and then optimized and deployed at the edge to provide inference services)
Syed is considered analogous art to the combination and to the claimed invention because all are focused on arranging tasks. Therefore, it would have been obvious to try, for one of ordinary skill in the art before the effective filing date of the claimed invention, to use Syed's teaching of offloading tasks onto the server and onto the edge based on parameters such as size and consumption cost, as the combination teaches, to obtain better latency and to make optimal offloading decisions.
The motivation would have been to make optimal offloading decisions by considering factors such as task complexity, latency, energy consumption, network conditions, device resources, and privacy concerns, so that services can be effectively offloaded to be performed either locally at the edge, in the fog, or on the cloud resource (Syed, Section 5.2).
Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, Alvarez, Zhang, and Syed, further in view of H. H. Bothe, "Audio to audio-video speech conversion with the help of phonetic knowledge integration," 1997 IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation, Orlando, FL, USA, 1997, pp. 1632-1637 vol. 2 (hereinafter "Bothe"), and further in view of Waterworth, Nicholas, "Speech Processing in Computer Vision Applications," 2020 (hereinafter "Nicholas").
Regarding claim 9, the combination of Qu, Teerapittayanon, Alvarez, Zhang, and Syed teaches all the limitations of claim 8, but does not explicitly teach receiving voice data; converting the voice data into a video for an avatar through the voice-based avatar generation model; generating text information corresponding to the voice data based on the voice data; and displaying the video of the avatar and the text information.
However, Bothe teaches receiving voice data (Fig. 1, Section 1: "Whereas with a speech input the facial animation is synchronized with respect to the acoustic signal"
[Image: media_image4.png, reproduced from Bothe]
); converting the voice data into a video for an avatar through the voice-based avatar generation model (Section 1, Figure 1: "LIPPS converts acoustic speech signals from a microphone input into a facial animation on a computer screen with corresponding movements of the mouth region (lips, teeth, tongue)"). As shown above in Fig. 1, a voice is converted into a facial animation.
Bothe also teaches displaying the video of the avatar and the text information (Section 1:
[Image: media_image5.png, reproduced from Bothe]
"The LIPPS terminal is integrated on a multi-media Pentium computer under Windows 95 as shown in figure 2. It is served by keyboard or mouse. If used as a training aid for lipreading, the user may enter the acoustic utterances to be processed with the help of an integrated microphone or by keyboard, or may store the recorded or processed data on hard disk for later display"; Section 2: "a facial image of the actual speaker is displayed on the screen").
Bothe is considered analogous art to the combination of Qu, Teerapittayanon, Alvarez, Zhang, and Syed and to the claimed invention because all focus on model arrangement. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Bothe's teaching of using a voice to create an animation to the combination in order to visualize the video of the avatar.
The motivation would have been to create a realistic computer animation by using a phoneme recognizer to convert speech utterances into corresponding sequences of phoneme probability vectors, followed by a connected phoneme-sequence-to-video conversion (Bothe, Section 1).
However, Bothe and the combination of Qu, Teerapittayanon, Alvarez, Zhang, and Syed do not explicitly teach generating text information corresponding to the voice data based on the voice data.
However, Nicholas teaches generating text information corresponding to the voice data based on the voice data (Section 1.1: "Speech Recognition handles listening to a given speaker and interpreting what words are being said and turning them into their textual representation").
Nicholas is considered analogous art to the combination of Qu, Teerapittayanon, Alvarez, Zhang, Syed, and Bothe and to the claimed invention because all focus on model analysis of input data. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Nicholas's teaching of generating text from a voice to the target model in order to obtain both the video and the corresponding text information using the target model.
The motivation would have been to achieve better voice detection by removing noise from the speech sample so that better textual information can be extracted from the voice using speech recognition (Nicholas, Conclusion).
Regarding claim 10, the combination of Qu, Teerapittayanon, Alvarez, Zhang, Syed, Bothe, and Nicholas teaches all the limitations of claim 9, and Nicholas also teaches wherein generating the text information comprises: determining a voice feature based on the voice data; and acquiring the text information based on the voice feature (Section 1.1: "Speaker Recognition is a class comparison problem to be able to identify a speaker from the characteristics in a given spoken phrase. The work in Speaker Recognition revolves around creating unique representations of speech samples and training networks to compare with extracted features… Speech Recognition handles listening to a given speaker and interpreting what words are being said and turning them into their textual representation").
Claims 11-13, 15-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, and further in view of Lee (US 20200311546 A1, hereinafter "Lee").
Regarding claim 11, it is of the same scope as claim 1, with the additional elements of an electronic device comprising: at least one processor; and a memory coupled to the at least one processor and having instructions stored therein, wherein the instructions, when executed by the at least one processor, cause the electronic device to perform actions. These additional elements are taught by Lee (Fig. 7, Para. 79: "The processor 710 may perform the deep neural network partitioning function described above with reference to FIG. 3 to FIG. 5 and FIG. 6A to FIG. 6C. Further, the processor 710 may load program instructions to implement the deep neural network partitioning function onto the memory 720, and may control to perform operations described above with reference to FIG. 3 to FIG. 5 and FIG. 6A to FIG. 6C. The program instructions may be stored in the storage device 730 or may be stored in another system connected via a network.").
Lee is considered analogous art to the claimed invention and to the combination applied to claim 1 because all focus on model partitioning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a processor and a memory to perform the model arrangement as claimed above in claim 1.
The motivation would have been to facilitate partitioning, since a DNN requires a large amount of computation, storage space, and energy consumption in its operation processes (Lee, Paras. 3-10).
Regarding claim 12, the combination of Qu, Teerapittayanon, and Lee teaches all the limitations of claim 11, and Qu also teaches that determining the target model comprises: determining a code of the target model (Section B, "Intelligent Offloading Decision-Making": "Besides, Neurosurgeon [30] was a fine-grained partitioning method that can find the optimal dividing point in DNNs according to different factors, and made full use of the resources of cloud servers and mobile devices to minimize the computational delays or energy consumption in IoT environments"); the different factors are interpreted as a code used to find the optimal dividing points in DNNs for model arrangement; and determining a group of neural network models in the target model by analyzing the code (Section B, "Intelligent Offloading Decision-Making": "Deep learning methods refer to the classification of the input task information through the multi-layer neural network to determine the final offloading position. Huang et al. [28] provided an algorithm that adopted distributed deep learning to solve the offloading problem of mobile edge networks. It used parallel and distributed DNNs to produce offloading decisions and achieved good results.").
Regarding claim 13, the combination of Qu, Teerapittayanon, and Lee teaches all the limitations of claim 12, and Teerapittayanon also teaches determining a task of each neural network model in the group of neural network models (Section 1, "Introduction": "An example of one such distributed approach is to combine a small NN model (less number of parameters) on end devices and a larger NN model (more number of parameters) in the cloud. The small model at an end device can quickly perform initial feature extraction, and also classification if the model is confident. Otherwise, the end device can fall back to the large NN model in the cloud, which performs further processing and final classification"), and determining neural network models that implement the same task as a module (Section 1, "Introduction": "Multiple models at the cloud, the edge and the device need to be learned jointly to allow coordinated decision making. Computation already performed on end device models should be useful for further processing on edge or cloud models").
Regarding claims 15 and 16, the combination of Qu, Teerapittayanon, and Lee teaches all the limitations of claim 11. Claims 15 and 16 contain contingent limitations. Qu teaches the claim 15 limitation of determining a ratio of the size to the quantity and arranging, if the ratio is greater than a threshold, the target module in an edge device, and the claim 16 limitation of arranging, if the ratio is less than or equal to the threshold, the target module in a server (Section A, "Inner Model": "As shown in Fig. 3, the inner model is based on a parallel Deep Reinforcement Learning (DRL) algorithm. We apply a classic reinforcement learning method named Q-learning, in which we input environmental parameters, labeled initial parameters and workflow x into the inner model. We use ai to represent the offloading decision of the i-th subtask of the workflow, which is defined as:
[Image: media_image1.png – equation defining the offloading decision ai, reproduced from Qu]
where ai = 0, 1, and 2 indicate that the i-th subtask is executed locally on the IoT device, the edge server, and the cloud server, respectively."). Qu determines an offloading decision for each DNN by a different method but with the same result of arranging the tasks on the edge or cloud server based on a threshold.
Regarding claim 20, it is of the same scope as claim 11, with the additional elements of a computer program product tangibly stored on a non-transitory computer-readable medium and comprising machine-executable instructions, wherein the machine-executable instructions, when executed by a machine, cause the machine to perform a method for model arrangement. Lee also teaches these elements (Paras. 75 and 79: "A computer program(s) may be written in any form of a programming language, including compiled or interpreted languages and may be deployed in any form including a stand-alone program or a module, a component, a subroutine, or other units suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A non-transitory computer-readable media may be any available media that may be accessed by a computer, and may include both computer storage media and transmission media.").
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon and Lee, and further in view of Alvarez.
Regarding claim 14, the combination of Qu, Teerapittayanon, and Lee teaches all the limitations of claim 11, but does not explicitly teach determining the number of neurons in the target module and determining the quantity of the parameters based on the number of neurons.
However, Alvarez teaches determining the number of neurons in the target module (Section 3, "Deep model selection: learning with structure sparsity": "We now introduce our approach to automatically determining the number of neurons in each layer of a deep network while learning the network parameters"), and
determining the quantity of the parameters based on the number of neurons (Section 3, "Deep model selection: learning with structure sparsity": "A general deep network can be described as a succession of L layers performing linear operations on their input, intertwined with non-linearities, such as Rectified Linear Units (ReLU) or sigmoid, and, potentially, pooling operations. Each layer l consists of N_l neurons, each of which is encoded by parameters θ_l^n = [w_l^n, b_l^n], where w_l^n is a linear operator acting on the layer's input and b_l^n is a bias. Altogether, these parameters form the parameter set Θ = {θ_l}_{1≤l≤L}, with θ_l = {θ_l^n}_{1≤n≤N_l}").
Alvarez is considered analogous art to the combination of Qu, Teerapittayanon, and Lee and to the claimed invention because all use parameters to analyze models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Alvarez's teaching of determining the number of neurons, and determining the quantity of the parameters based on the number of neurons, to the electronic device of the combination.
The motivation would have been to reduce redundant parameters and improve the accuracy of the network by automatically determining the number of neurons, which helps to reduce memory usage and computational cost and keeps the model arrangement reasonable (Alvarez, Abstract).
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, Lee, and Alvarez, and further in view of Zhang.
Regarding claim 17, the combination of Qu, Teerapittayanon, Lee, and Alvarez teaches all the limitations of claim 16, but does not explicitly teach wherein the target model is a voice-based avatar generation model, and the voice-based avatar generation model comprises an audio-based avatar parameter generation module, an avatar video-based avatar parameter generation module, an initial drawing module, and a fine-tuning drawing module.
However, Zhang teaches that the target model is a voice-based avatar generation model (Abstract: "To model such complicated relationships among different face attributes with input audio, we propose a FACe Implicit Attribute Learning Generative Adversarial Network (FACIAL-GAN), which integrates the phonetics-aware, context-aware, and identity-aware information to synthesize the 3D face animation with realistic motions of lips, head poses, and eye blinks"), and that the voice-based avatar generation model comprises an audio-based avatar parameter generation module, an avatar video-based avatar parameter generation module, an initial drawing module, and a fine-tuning drawing module (Section 2, "Related works":
[Image: media_image2.png, reproduced from Zhang]
we introduce the FACIAL- GAN module to integrate phonetic, contextual, and personalized information of the talking, and combine the synthesized 3D model with AU attention map to generate photorealistic videos with synchronized lip motion, personalized and natural head poses and eye blinks… section 3.3 with the combination of geometry, texture and illumination coefficients from reference video and generated expression and head pose coefficients from input audio, we can render the 3D face with personalized head movements. 3D model describes the head pose better than 2D methods by rotating and translating the head. (initial drawing), …section 3.3, “Rendering-to-Video Network”, We employ the rendering-to-video network to translate the rendering images into the final photo-realistic images( fine tuning)).
Zhang is considered analogous art to the combination of Qu, Teerapittayanon, Lee, and Alvarez and to the claimed invention because all are focused on multi-module model systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Zhang's teaching of a face generation model that uses a voice and a video to generate a common output of a photo-realistic video of the target face with natural lip motions using the electronic device.
The motivation would have been to obtain better video quality by generating realistic talking face videos with not only synchronized lip motions, but also natural head movements and eye blinks (Zhang, Abstract).
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, Lee, Alvarez, and Zhang, and further in view of Syed.
Regarding claim 18, the combination of Qu, Teerapittayanon, Lee, Alvarez, and Zhang teaches all the limitations of claim 17, but the combination does not explicitly teach wherein the audio-based avatar parameter generation module, the avatar video-based avatar parameter generation module, and the fine-tuning drawing module are arranged in the server, and the initial drawing module is arranged in the edge device.
However, Syed teaches wherein the audio-based avatar parameter generation module, the avatar video-based avatar parameter generation module, and the fine-tuning drawing module are arranged in the server, and the initial drawing module is arranged in the edge device (Section 5.2:
[Image: media_image3.png, reproduced from Syed]
Fig. 8. AIaaS services for edge/fog computing in a data-driven offloading scenarios or privacy preserving circumstances requiring FL hierarchical aggregation
The offloading algorithms should consider the factors mentioned previously to make the optimal offloading decisions. For efficient processing of these AI related tasks at the edge and fog, the AIaaS platform can be leveraged to provide services such as model training, serving and monitoring (SLA/SLO parameters), pre-trained model access, data storage and management, optimizing models for different processing layers (edge, fog) and data visualization and analytics services for clients…. Some examples of AI based offloading techniques are lightweight AI models that can be trained at the cloud and optimized for edge and deployed at the edge to do simpler inference tasks. For example, TinyML models which can run on constrained devices [102] can be first trained on the cloud AIaaS deployment and then optimized and deployed at the edge to provide inference services)
Syed is considered analogous art to the combination and to the claimed invention because all are focused on arranging tasks. Therefore, it would have been obvious to try, for one of ordinary skill in the art before the effective filing date of the claimed invention, to use Syed's teaching of offloading tasks onto the server and onto the edge based on parameters such as size and consumption cost, as the combination teaches, to obtain better latency and to make optimal offloading decisions.
The motivation would have been to make optimal offloading decisions by considering factors such as task complexity, latency, energy consumption, network conditions, device resources, and privacy concerns, so that services can be effectively offloaded to be performed either locally at the edge, in the fog, or on the cloud resource (Syed, Section 5.2).
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Qu in view of Teerapittayanon, Lee, Alvarez, Zhang, and Syed, further in view of Bothe, and further in view of Nicholas.
Regarding claim 19, the combination of Qu, Teerapittayanon, Lee, Alvarez, Zhang, and Syed teaches all the limitations of claim 18, but does not explicitly teach receiving voice data; converting the voice data into a video for an avatar through the voice-based avatar generation model; generating text information corresponding to the voice data based on the voice data; and displaying the video of the avatar and the text information.
However, Bothe teaches receiving voice data (Fig. 1, Section 1: "Whereas with a speech input the facial animation is synchronized with respect to the acoustic signal"
[Image: media_image4.png, reproduced from Bothe]
); converting the voice data into a video for an avatar through the voice-based avatar generation model (Section 1, Figure 1: "LIPPS converts acoustic speech signals from a microphone input into a facial animation on a computer screen with corresponding movements of the mouth region (lips, teeth, tongue)"). As shown above in Fig. 1, a voice is converted into a facial animation.
Bothe also teaches displaying the video of the avatar and the text information (Section 1:
[Image: media_image5.png, reproduced from Bothe]
"The LIPPS terminal is integrated on a multi-media Pentium computer under Windows 95 as shown in figure 2. It is served by keyboard or mouse. If used as a training aid for lipreading, the user may enter the acoustic utterances to be processed with the help of an integrated microphone or by keyboard, or may store the recorded or processed data on hard disk for later display"; Section 2: "a facial image of the actual speaker is displayed on the screen").
Bothe is considered analogous art to the combination of Qu, Teerapittayanon, Lee, Alvarez, Zhang, and Syed and to the claimed invention because all focus on model arrangement. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Bothe's teaching of using a voice to create an animation to the combination in order to visualize the video of the avatar.
The motivation would have been to create a realistic computer animation by using a phoneme recognizer to convert speech utterances into corresponding sequences of phoneme probability vectors, followed by a connected phoneme-sequence-to-video conversion (Bothe, Section 1).
However, Bothe and the combination of Qu, Teerapittayanon, Lee, Alvarez, Zhang, and Syed do not explicitly teach generating text information corresponding to the voice data based on the voice data.
However, Nicholas teaches generating text information corresponding to the voice data based on the voice data (Section 1.1: "Speech Recognition handles listening to a given speaker and interpreting what words are being said and turning them into their textual representation").
Nicholas is considered analogous art to the combination of Qu, Teerapittayanon, Lee, Alvarez, Zhang, Syed, and Bothe and to the claimed invention because all focus on model analysis of input data. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Nicholas's teaching of generating text from a voice to the target model in order to obtain both the video and the corresponding text information using the target model.
The motivation would have been to achieve better voice detection by removing noise from the speech sample so that better textual information can be extracted from the voice using speech recognition (Nicholas, Conclusion).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Shang, Chong (US 20210027511 A1, published 2021-01-28): this application is similar to the claimed invention because it discloses a system and method for animation generation using input audio data, generating an output based on the generated final prediction.
Tajima, Kaori (US 20190260827 A1, published 2019-08-22): this application is similar to the claimed invention because it discloses model partitioning to appropriately allocate program processing between edge servers and a backend server in an edge computing system.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABRHAM A. TAMIRU whose telephone number is (571)272-6987. The examiner can normally be reached Monday - Friday 8:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ryan Pitaro, can be reached at 571-272-4071. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.A.T./Examiner, Art Unit 2188
/RYAN F PITARO/Supervisory Patent Examiner, Art Unit 2188