DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-12 and 14-21 are pending in this Office action.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 12, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Aslan et al. (US 20170132528 A1) in view of Kang et al. (US 20230132630 A1).
Regarding claim 1, Aslan teaches an image processing method (See Aslan: Fig. 1, and [0021], “FIG. 1 is a schematic diagram of an example technique for jointly training multiple machine learning models. FIG. 1 illustrates a first machine learning model 100 and a second machine learning model 102 that make up a set of machine learning models that are to be trained in parallel, according to the techniques and systems described herein. In FIG. 1, the first machine learning model 100 is denoted as a “teacher machine learning model” or “teacher model,” and the second machine learning model 102 is denoted as a “student machine learning model” or “student model.” Calling the first model 100 a “teacher model” and the second model 102 a “student model” is somewhat arbitrary because either model can be capable of learning from the other. The notion of a “teacher model” is one where the teacher influences the training of the student (i.e., the student learns, at least partly, from the teacher)”), comprising:
acquiring an original image to be processed (See Aslan: Fig. 1, and [0024], “The training data 104 can be stored in a database or repository of any suitable data, such as image data, speech data, text data, video data, or any other suitable type of data that can be processed by the machine learning models 100 and 102. For example, the training data 104 can comprise a repository of images that are to be classified or labeled by the machine learning models 100 and/or 102. The training data 104 can further include at least two additional components: features and labels. However, the training data 104 may be unlabeled in some implementations, such that the machine learning models 100 and/or 102 can be trained using any suitable learning technique, such as supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and so on”. Note that retrieving the image data and video data from the database is mapped to acquiring an original image to be processed);
inputting the original image into a first image processing model (See Aslan: Fig. 1, and [0023], “Thus, although FIG. 1 shows both models 100 and 102 as explicitly receiving, or having access to, the training data 104, it is to be appreciated that any individual machine learning model shown in the Figures and described herein can receive, or have access to, at least some of the training data 104 in particular implementations, even if an explicit connection between an individual model and the training data is not depicted in the Figures”. Note that the input image data in 104 can be received and accessed by both models 100 and 102. Model 100 (teacher model) is referred to as the first model in Aslan, and model 102 (student model) as the second model; for mapping purposes, however, the student model 102 is mapped to the claimed first model and the teacher model 100 to the claimed second model, and the student model 102 receiving image data from database 104 is mapped to “inputting the original image into a first image processing model”);
processing the original image by the first image processing model to generate a target image,
wherein the first image processing model and a second image processing model are obtained by online alternate training (See Aslan: Fig. 1, and [0026], “Joint training of the first model 100 and the second model 102 involves training the models 100 and 102 in parallel such that at least one of the models 100 and/or 102 influences the training of the other model. For example, the first model 100 can learn from the training data 104, and the training of the second model 102 can be influenced by what the first model 100 is learning from the training data 104 while the first model 100 is being trained, and/or before the first model 100 completes its training. In this sense, the second (student) model 102 can be considered to be learning from the first (teacher) model 100 as the first model 100 learns. The aforementioned scenario is depicted visually in FIG. 1 by the path 106 that goes from the training data 104 to the first model 100, and from the first model 100 to the second model 102”; and [0032], “In the implementation where the two models 100 and 102 collaborate with each other during joint training (shown via the path 110 in FIG. 1), the models 100 and 102 can process any suitable unlabeled data. For example, a billion unknown images can be downloaded from a database of images on the Web, or, alternatively, the training data 104 can be utilized by “throwing away” labels, if necessary, and processing the unlabeled training data 104. The objective function used for joint training can be formulated in a way to effectively allow the two models 100 and 102 to collaborate and discuss their respective predictions with each other (via the path 110) to help each model learn how the other model thinks, which factors into its own training. For instance, the first model 100 can predict that an unknown image is a cat with 0.9 probability, while the second model 102 predicts that the same unknown image is a cat with 0.6 probability and a dog with 0.3 probability. 
This information can be passed between the models 100 and 102 via the path 110 during joint training by virtue of terms included in the objective function for both models”. Note that the joint training of the student and teacher models, in which, as shown via path 110, the results of one model are passed to the other model during training, is mapped to the alternate training of the first and second image processing models, while the fact that the training data is downloaded from the Web is mapped to the “online” training, that is, using online data to train the models),
supervision information during training process of the first image processing model comprises at least part of images generated by the second image processing model during training process (See Aslan: Fig. 1, and [0023], “In some implementations, the second model 102 can access some data for joint training purposes, and the second model 102 can access other new data that is inaccessible to the first model 100 when the first model 100 is training, but accessible to the first model 100 when the first model 100 passes output to the second model 102. “Passing information,” in this sense, is described in more detail below”. Note that the first model 100 (teacher) passing output to the second model 102 (student) is mapped to the supervision information comprising at least part of the images generated by the second image processing model (teacher model) during the training process. Note that the teacher model of Aslan corresponds to the second model of the instant application), and
a model scale of the first image processing model is smaller than a model scale of the second image processing model (See Aslan: Fig. 4, and [0054], “FIG. 4 also shows that training data 404 can be used to train one or more of the machine learning models of FIG. 4, such as the teacher model 400. FIG. 4 also indicates that the P student models 402 can decrease in size from 402(1) to 402(P) in terms of the amount of memory to store each of the student models 402 in the set of P student models 402. This can be beneficial if the last student model 402(P) in the chain of student models 402 is to be deployed on a mobile device with limited memory and/or processing power, and instead of going straight from a potentially very large teacher model 400 to a single student model 402(P) that is small enough to deploy, as might be the case with the example of FIG. 1, the implementations of FIG. 4 allows for model compression from a relatively large teacher model 400, to a slightly smaller student model 402(1), and then to a slightly smaller student model 402(2), and so on. Eventually, the joint model training results in a trained student model 402(P) that is a compressed form of the teacher model 400, and the student model 402(P) can be deployed on a computing device with limited resources. It is to be appreciated, however, that the machine learning models of FIG. 4 can be of the same, or similar size, while differing in architecture, for example, without departing from the basic nature of the joint training techniques disclosed herein”. Note that the student model is a compressed version of the teacher model and can be compressed to a version small enough to be deployed on a mobile device. This is mapped to the model scale of the first model (student) being smaller than the model scale of the second model (teacher)); and
outputting the target image.
However, Aslan fails to explicitly disclose processing the original image by the first image processing model to generate a target image; and outputting the target image.
However, Kang teaches processing the original image by the first image processing model to generate a target image (See Kang: Figs. 6-7, and [0112], “As illustrated in FIG. 7, an example may be a process of generating an image of staining on a horse included in the image. The input image may be a nature image that includes a horse that is a subject to be deformed, and the image generating apparatus may extract a region of a horse recognized from the input image to input the corresponding region to the image generation network (Student Decoder 1) trained by a knowledge distillation scheme based on the energy-based model”. Note that the student model (mapped to the first model) has been trained using the distillation method, and the input image is processed by the student model to generate a final, modified image; this output image is mapped to the target image); and
outputting the target image (See Kang: Fig. 7, and [0114], “Images that are output from each image generation network may be substituted into the corresponding regions and may be summed. As described in an example, an output image in which a horse is stained may be obtained corresponding to the input image”. Note that the final output image from the student models is mapped to outputting the target image).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Aslan to process the original image by the first image processing model to generate a target image, and to output the target image, as taught by Kang, in order to reduce a structural similarity loss, a perceptual loss, and a style loss (See Kang: Fig. 2, and [0006], “In addition, knowledge distillation for a final output may reduce the L2 norm between the teacher network and the student network, or minimize a structural similarity loss, a perceptual loss, and a style loss”). Aslan teaches a method and system that may jointly train the teacher model and the student model, and transfer the teacher model parameters to the student model in a compressed fashion; while Kang teaches a system and method that may train the teacher model, transfer knowledge of the teacher model to the student model, and use the student model to generate and output target images by processing input images. Therefore, it would have been obvious to one of ordinary skill in the art to modify Aslan by Kang to process the input image and output the processed target image using the student model after joint training of the teacher and student models. The motivation to modify Aslan by Kang is the “use of known technique to improve similar devices (methods, or products) in the same way”.
Regarding claim 12, Aslan and Kang teach all the features with respect to claim 1 as outlined above. Further, Aslan teaches the method according to claim 1, wherein the first image processing model is applied to a lightweight terminal device (See Aslan: Fig. 1, and [0044], “The joint training techniques described herein can be used for various applications. One example application is model compression, which allows for compact representations of deep (i.e., many layers) machine learning models that generally are allocated a large amount of memory to maintain, are complex in architecture, and use a high amount of processing power to operate at runtime. For example, the first (teacher) model 100 of FIG. 1 can comprise a large, complex ensemble of machine learning models that is often too large and/or slow to be used at run-time in particular scenarios. Meanwhile, the second (student) model 102 can comprise a much smaller machine learning model (e.g., a neural net with 1000 times fewer parameters than the first model 100) that has the size and/or speed that is advantageous at run-time in particular scenarios. By joint training the first and second models 100 and 102 using the techniques and systems described herein, the second model 102 can be trained to mimic the much larger first model 100 (through learning how to approximate the function learned by the first model 100) without significant loss in accuracy of the second model's 102 output. Because the smaller second model 102 take much less memory to maintain and can operate faster on less processing power at runtime, the second model 102 can be a compressed form of the larger first model 100 such that the second model 102 can be more readily deployed on computing devices with limited resources (e.g., mobile devices, wearables, etc.)”.
The student model is deployed on the mobile device (mapped to the lightweight terminal device), and this is mapped to the cited limitation of “wherein the first image processing model is applied to a lightweight terminal device”).
Regarding claim 14, Aslan and Kang teach all the features with respect to claim 1 as outlined above. Further, Aslan teaches an electronic device (See Aslan: Fig. 1, and [0021], “FIG. 1 is a schematic diagram of an example technique for jointly training multiple machine learning models. FIG. 1 illustrates a first machine learning model 100 and a second machine learning model 102 that make up a set of machine learning models that are to be trained in parallel, according to the techniques and systems described herein. In FIG. 1, the first machine learning model 100 is denoted as a “teacher machine learning model” or “teacher model,” and the second machine learning model 102 is denoted as a “student machine learning model” or “student model.” Calling the first model 100 a “teacher model” and the second model 102 a “student model” is somewhat arbitrary because either model can be capable of learning from the other. The notion of a “teacher model” is one where the teacher influences the training of the student (i.e., the student learns, at least partly, from the teacher)”), comprising:
one or more processors (See Aslan: Fig. 8, and [0066], “The processor(s) 804 can be configured to execute instructions, applications, or programs stored in the memory 806. In some implementations, the processor(s) 804 can include hardware processors that include, without limitation, a hardware central processing unit (CPU), a field programmable gate array (FPGA), a complex programmable logic device (CPLD), an application specific integrated circuit (ASIC), a system-on-chip (SoC), or a combination thereof”); and
a storage apparatus, configured to store one or more programs (See Aslan: Fig. 8, and [0066], “Depending on the exact configuration and type of computing device, the memory 806 can be volatile (e.g., random access memory (RAM)), non-volatile (e.g., read only memory (ROM), flash memory, etc.), or some combination of the two”),
wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement an image processing method according to any one of claims 1-12 (See Aslan: Fig. 8, and [0066], “The memory 806 can include machine learning training module 808, a scheduling module 810, one or more program modules 812 or application programs, and program data 814 accessible to the processor(s) 804”; and [0069], “In some implementations, any or all of the memory 806, removable storage 816, and non-removable storage 818 can store programming instructions, data structures, program modules and other data, which, when executed by the processor(s) 804, implement some or all of the processes described herein”), which comprises:
acquiring an original image to be processed (See Aslan: Fig. 1, and [0024], “The training data 104 can be stored in a database or repository of any suitable data, such as image data, speech data, text data, video data, or any other suitable type of data that can be processed by the machine learning models 100 and 102. For example, the training data 104 can comprise a repository of images that are to be classified or labeled by the machine learning models 100 and/or 102. The training data 104 can further include at least two additional components: features and labels. However, the training data 104 may be unlabeled in some implementations, such that the machine learning models 100 and/or 102 can be trained using any suitable learning technique, such as supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and so on”. Note that retrieving the image data and video data from the database is mapped to acquiring an original image to be processed);
inputting the original image into a first image processing model (See Aslan: Fig. 1, and [0023], “Thus, although FIG. 1 shows both models 100 and 102 as explicitly receiving, or having access to, the training data 104, it is to be appreciated that any individual machine learning model shown in the Figures and described herein can receive, or have access to, at least some of the training data 104 in particular implementations, even if an explicit connection between an individual model and the training data is not depicted in the Figures”. Note that the input image data in 104 can be received and accessed by both models 100 and 102. Model 100 (teacher model) is referred to as the first model in Aslan, and model 102 (student model) as the second model; for mapping purposes, however, the student model 102 is mapped to the claimed first model and the teacher model 100 to the claimed second model, and the student model 102 receiving image data from database 104 is mapped to “inputting the original image into a first image processing model”);
processing the original image by the first image processing model to generate a target image (See Kang: Figs. 6-7, and [0112], “As illustrated in FIG. 7, an example may be a process of generating an image of staining on a horse included in the image. The input image may be a nature image that includes a horse that is a subject to be deformed, and the image generating apparatus may extract a region of a horse recognized from the input image to input the corresponding region to the image generation network (Student Decoder 1) trained by a knowledge distillation scheme based on the energy-based model”. Note that the student model (mapped to the first model) has been trained using the distillation method, and the input image is processed by the student model to generate a final, modified image; this output image is mapped to the target image),
wherein the first image processing model and a second image processing model are obtained by online alternate training (See Aslan: Fig. 1, and [0026], “Joint training of the first model 100 and the second model 102 involves training the models 100 and 102 in parallel such that at least one of the models 100 and/or 102 influences the training of the other model. For example, the first model 100 can learn from the training data 104, and the training of the second model 102 can be influenced by what the first model 100 is learning from the training data 104 while the first model 100 is being trained, and/or before the first model 100 completes its training. In this sense, the second (student) model 102 can be considered to be learning from the first (teacher) model 100 as the first model 100 learns. The aforementioned scenario is depicted visually in FIG. 1 by the path 106 that goes from the training data 104 to the first model 100, and from the first model 100 to the second model 102”; and [0032], “In the implementation where the two models 100 and 102 collaborate with each other during joint training (shown via the path 110 in FIG. 1), the models 100 and 102 can process any suitable unlabeled data. For example, a billion unknown images can be downloaded from a database of images on the Web, or, alternatively, the training data 104 can be utilized by “throwing away” labels, if necessary, and processing the unlabeled training data 104. The objective function used for joint training can be formulated in a way to effectively allow the two models 100 and 102 to collaborate and discuss their respective predictions with each other (via the path 110) to help each model learn how the other model thinks, which factors into its own training. For instance, the first model 100 can predict that an unknown image is a cat with 0.9 probability, while the second model 102 predicts that the same unknown image is a cat with 0.6 probability and a dog with 0.3 probability. 
This information can be passed between the models 100 and 102 via the path 110 during joint training by virtue of terms included in the objective function for both models”. Note that the joint training of the student and teacher models, in which, as shown via path 110, the results of one model are passed to the other model during training, is mapped to the alternate training of the first and second image processing models, while the fact that the training data is downloaded from the Web is mapped to the “online” training, that is, using online data to train the models),
supervision information during training process of the first image processing model comprises at least part of images generated by the second image processing model during training process (See Aslan: Fig. 1, and [0023], “In some implementations, the second model 102 can access some data for joint training purposes, and the second model 102 can access other new data that is inaccessible to the first model 100 when the first model 100 is training, but accessible to the first model 100 when the first model 100 passes output to the second model 102. “Passing information,” in this sense, is described in more detail below”. Note that the first model 100 (teacher) passing output to the second model 102 (student) is mapped to the supervision information comprising at least part of the images generated by the second image processing model (teacher model) during the training process. Note that the teacher model of Aslan corresponds to the second model of the instant application), and
a model scale of the first image processing model is smaller than a model scale of the second image processing model (See Aslan: Fig. 4, and [0054], “FIG. 4 also shows that training data 404 can be used to train one or more of the machine learning models of FIG. 4, such as the teacher model 400. FIG. 4 also indicates that the P student models 402 can decrease in size from 402(1) to 402(P) in terms of the amount of memory to store each of the student models 402 in the set of P student models 402. This can be beneficial if the last student model 402(P) in the chain of student models 402 is to be deployed on a mobile device with limited memory and/or processing power, and instead of going straight from a potentially very large teacher model 400 to a single student model 402(P) that is small enough to deploy, as might be the case with the example of FIG. 1, the implementations of FIG. 4 allows for model compression from a relatively large teacher model 400, to a slightly smaller student model 402(1), and then to a slightly smaller student model 402(2), and so on. Eventually, the joint model training results in a trained student model 402(P) that is a compressed form of the teacher model 400, and the student model 402(P) can be deployed on a computing device with limited resources. It is to be appreciated, however, that the machine learning models of FIG. 4 can be of the same, or similar size, while differing in architecture, for example, without departing from the basic nature of the joint training techniques disclosed herein”. Note that the student model is a compressed version of the teacher model and can be compressed to a version small enough to be deployed on a mobile device. This is mapped to the model scale of the first model (student) being smaller than the model scale of the second model (teacher)); and
outputting the target image (See Kang: Fig. 7, and [0114], “Images that are output from each image generation network may be substituted into the corresponding regions and may be summed. As described in an example, an output image in which a horse is stained may be obtained corresponding to the input image”. Note that the final output image from the student models is mapped to outputting the target image).
Regarding claim 15, Aslan and Kang teach all the features with respect to claim 1 as outlined above. Further, Aslan and Kang teach a non-transitory computer-readable storage medium, comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are configured to perform an image processing method according to any one of claims 1-12 (See Aslan: Fig. 1, and [0021], “FIG. 1 is a schematic diagram of an example technique for jointly training multiple machine learning models. FIG. 1 illustrates a first machine learning model 100 and a second machine learning model 102 that make up a set of machine learning models that are to be trained in parallel, according to the techniques and systems described herein. In FIG. 1, the first machine learning model 100 is denoted as a “teacher machine learning model” or “teacher model,” and the second machine learning model 102 is denoted as a “student machine learning model” or “student model.” Calling the first model 100 a “teacher model” and the second model 102 a “student model” is somewhat arbitrary because either model can be capable of learning from the other. The notion of a “teacher model” is one where the teacher influences the training of the student (i.e., the student learns, at least partly, from the teacher)”), which comprises:
acquiring an original image to be processed (See Aslan: Fig. 1, and [0024], “The training data 104 can be stored in a database or repository of any suitable data, such as image data, speech data, text data, video data, or any other suitable type of data that can be processed by the machine learning models 100 and 102. For example, the training data 104 can comprise a repository of images that are to be classified or labeled by the machine learning models 100 and/or 102. The training data 104 can further include at least two additional components: features and labels. However, the training data 104 may be unlabeled in some implementations, such that the machine learning models 100 and/or 102 can be trained using any suitable learning technique, such as supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and so on”. Note that retrieving the image data and video data from the database is mapped to acquiring an original image to be processed);
inputting the original image into a first image processing model (See Aslan: Fig. 1, and [0023], “Thus, although FIG. 1 shows both models 100 and 102 as explicitly receiving, or having access to, the training data 104, it is to be appreciated that any individual machine learning model shown in the Figures and described herein can receive, or have access to, at least some of the training data 104 in particular implementations, even if an explicit connection between an individual model and the training data is not depicted in the Figures”. Note that the input image data in 104 can be received and accessed by both models 100 and 102. Model 100 (teacher model) is referred to as the first model in Aslan, and model 102 (student model) as the second model; for mapping purposes, however, the student model 102 is mapped to the claimed first model and the teacher model 100 to the claimed second model, and the student model 102 receiving image data from database 104 is mapped to “inputting the original image into a first image processing model”);
processing the original image by the first image processing model to generate a target image (See Kang: Figs. 6-7, and [0112], “As illustrated in FIG. 7, an example may be a process of generating an image of staining on a horse included in the image. The input image may be a nature image that includes a horse that is a subject to be deformed, and the image generating apparatus may extract a region of a horse recognized from the input image to input the corresponding region to the image generation network (Student Decoder 1) trained by a knowledge distillation scheme based on the energy-based model”. Note that the student model (mapped to the first model) has been trained using the distillation method, and the input image is processed by the student model to generate a final, modified image; this output image is mapped to the target image),
wherein the first image processing model and a second image processing model are obtained by online alternate training (See Aslan: Fig. 1, and [0026], “Joint training of the first model 100 and the second model 102 involves training the models 100 and 102 in parallel such that at least one of the models 100 and/or 102 influences the training of the other model. For example, the first model 100 can learn from the training data 104, and the training of the second model 102 can be influenced by what the first model 100 is learning from the training data 104 while the first model 100 is being trained, and/or before the first model 100 completes its training. In this sense, the second (student) model 102 can be considered to be learning from the first (teacher) model 100 as the first model 100 learns. The aforementioned scenario is depicted visually in FIG. 1 by the path 106 that goes from the training data 104 to the first model 100, and from the first model 100 to the second model 102”; and [0032], “In the implementation where the two models 100 and 102 collaborate with each other during joint training (shown via the path 110 in FIG. 1), the models 100 and 102 can process any suitable unlabeled data. For example, a billion unknown images can be downloaded from a database of images on the Web, or, alternatively, the training data 104 can be utilized by “throwing away” labels, if necessary, and processing the unlabeled training data 104. The objective function used for joint training can be formulated in a way to effectively allow the two models 100 and 102 to collaborate and discuss their respective predictions with each other (via the path 110) to help each model learn how the other model thinks, which factors into its own training. For instance, the first model 100 can predict that an unknown image is a cat with 0.9 probability, while the second model 102 predicts that the same unknown image is a cat with 0.6 probability and a dog with 0.3 probability. 
This information can be passed between the models 100 and 102 via the path 110 during joint training by virtue of terms included in the objective function for both models”. Note that in the joint training of the student and teacher models, as shown by path 110, each model's results are pushed to the other model during training; this is mapped to the claimed alternate training of the first and second image processing models. The fact that training data is downloaded from the Web is mapped to the claimed “online” training, i.e., using online data to train the models),
supervision information during training process of the first image processing model comprises at least part of images generated by the second image processing model during training process (See Aslan: Fig. 1, and [0023], “In some implementations, the second model 102 can access some data for joint training purposes, and the second model 102 can access other new data that is inaccessible to the first model 100 when the first model 100 is training, but accessible to the first model 100 when the first model 100 passes output to the second model 102. “Passing information,” in this sense, is described in more detail below”. Note that the output passed from the first model 100 (teacher) to the second model 102 (student) is mapped to at least part of the images generated by the claimed second image processing model during the training process. Note that the teacher model of Aslan corresponds to the second image processing model of the instant application), and
a model scale of the first image processing model is smaller than a model scale of the second image processing model (See Aslan: Fig. 4, and [0054], “FIG. 4 also shows that training data 404 can be used to train one or more of the machine learning models of FIG. 4, such as the teacher model 400. FIG. 4 also indicates that the P student models 402 can decrease in size from 402(1) to 402(P) in terms of the amount of memory to store each of the student models 402 in the set of P student models 402. This can be beneficial if the last student model 402(P) in the chain of student models 402 is to be deployed on a mobile device with limited memory and/or processing power, and instead of going straight from a potentially very large teacher model 400 to a single student model 402(P) that is small enough to deploy, as might be the case with the example of FIG. 1, the implementations of FIG. 4 allows for model compression from a relatively large teacher model 400, to a slightly smaller student model 402(1), and then to a slightly smaller student model 402(2), and so on. Eventually, the joint model training results in a trained student model 402(P) that is a compressed form of the teacher model 400, and the student model 402(P) can be deployed on a computing device with limited resources. It is to be appreciated, however, that the machine learning models of FIG. 4 can be of the same, or similar size, while differing in architecture, for example, without departing from the basic nature of the joint training techniques disclosed herein”. Note that the student model is a compressed version of the teacher model, and the student model can be compressed to a version small enough to be deployed on a mobile device. This is mapped to the model scale of the first image processing model (student) being smaller than the model scale of the second image processing model (teacher)); and
outputting the target image (See Kang: Fig. 7, and [0114], “Images that are output from each image generation network may be substituted into the corresponding regions and may be summed. As described in an example, an output image in which a horse is stained may be obtained corresponding to the input image”. Note that the final output image from the student model is mapped to the claimed outputting of the target image).
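For clarity of the record, the mapped teacher–student arrangement may be sketched as follows. This is a hypothetical Python illustration only; the toy linear models, parameter counts, and numeric values are the examiner's assumptions and are not drawn from Aslan or Kang. It shows a larger "second" (teacher) model whose outputs supervise a smaller "first" (student) model during training, corresponding to the claimed supervision and relative model scales.

```python
def teacher(x, w=(0.5, 0.25)):
    # Stand-in for the second image processing model: two parameters
    # (larger model scale). Its outputs serve as supervision.
    return w[0] * x + w[1] * x * x

def train_student(samples, steps=2000, lr=0.01):
    # Stand-in for the first image processing model: one parameter
    # (smaller model scale), trained against the teacher's outputs,
    # analogous to supervision by images the teacher generates.
    w = 0.0
    for _ in range(steps):
        for x in samples:
            pred, target = w * x, teacher(x)
            w -= lr * 2 * (pred - target) * x  # gradient of (pred - target)**2
    return w

w_student = train_student([0.2, 0.5, 0.8])
```

The single-parameter student cannot reproduce the teacher exactly, so it settles on the closest fit the teacher's outputs allow, which is the essence of the compression mapping cited from Aslan's FIG. 4.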
Allowable Subject Matter
Claims 2-10 and 16-21 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The best prior art searched does not teach the cited limitations of “The method according to claim 1, wherein the second image processing model is trained based on true-label data pairs during training process, a pseudo-label image is generated according to an unlabeled sample, and the pseudo-label image is generated based on steps comprising: acquiring an unlabeled sample as an input to the second image processing model during a process of the second image processing model and a discriminator performing adversarial training based on the true-label data pairs; generating candidate pseudo-label images according to the unlabeled sample by the second image processing model; and screening the candidate pseudo-label images by the discriminator to obtain a final pseudo-label image.”
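For clarity of the record, the allowable pseudo-label limitation may be sketched as follows. This is a hypothetical Python illustration only; the perturbation-based generator, the scoring function, and all values are the examiner's assumptions, not disclosures of the application or the cited references. It shows the recited steps: the second model generates candidate pseudo-label images from an unlabeled sample, and a discriminator screens the candidates to obtain the final pseudo-label.

```python
import random

random.seed(0)

def second_model(unlabeled, n_candidates=5):
    # Stand-in generator: perturb the unlabeled sample to form
    # candidate pseudo-labels (a real model would synthesize images).
    return [unlabeled + random.uniform(-1.0, 1.0) for _ in range(n_candidates)]

def discriminator_score(candidate):
    # Stand-in realism score; a discriminator trained adversarially on
    # true-label data pairs would replace this hypothetical function.
    return -abs(candidate - 3.0)

def make_pseudo_label(unlabeled):
    candidates = second_model(unlabeled)             # generate candidates
    return max(candidates, key=discriminator_score)  # screen with discriminator

pseudo = make_pseudo_label(2.5)
```

The screening step is what distinguishes the limitation: the discriminator selects among multiple candidates rather than merely judging a single output.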
Claim 11 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The best prior art searched does not teach the cited limitations of “The method according to claim 1, wherein the images generated by the second image processing model during training process further comprise a first image generated by the second image processing model according to a labeled sample among true-label data pairs; and if the first image processing model takes the first image as the supervision information, the first image processing model is trained based on steps comprising: acquiring a labeled sample corresponding to the first image as an input of the first image processing model; generating a third image according to the labeled sample by the first image processing model; determining a distillation loss according to the first image and the third image; and training the first image processing model according to the distillation loss.”
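For clarity of the record, the distillation-loss limitation of claim 11 may be sketched as follows. This is a hypothetical Python illustration only; the mean-squared-error form of the distillation loss, the linear student, and all values are the examiner's assumptions, not disclosures of the application. It follows the recited steps: the first model generates a "third image" from the labeled sample, a distillation loss is determined against the "first image" generated by the second model, and the first model is trained from that loss.

```python
def distillation_loss(first_image, third_image):
    # One common choice of distillation loss: mean squared error
    # between the second model's and first model's outputs (assumption).
    return sum((a - b) ** 2 for a, b in zip(first_image, third_image)) / len(first_image)

def train_step(student_w, labeled_sample, first_image, lr=0.1):
    third_image = [student_w * x for x in labeled_sample]  # first model's output
    loss = distillation_loss(first_image, third_image)
    grad = sum(2 * (t - f) * x for t, f, x in
               zip(third_image, first_image, labeled_sample)) / len(labeled_sample)
    return student_w - lr * grad, loss

# Toy run: the "first image" behaves as if produced by a teacher with w = 2.
w, sample, first = 0.0, [1.0, 2.0], [2.0, 4.0]
for _ in range(200):
    w, loss = train_step(w, sample, first)
```

Under this sketch the first model's parameter converges toward the behavior the second model exhibited on the labeled sample, which is the supervision relationship the claim recites.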
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GORDON G LIU whose telephone number is (571)270-0382. The examiner can normally be reached Monday - Friday 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona E Faulk can be reached at 571-272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GORDON G LIU/Primary Examiner, Art Unit 2618