Prosecution Insights
Last updated: April 19, 2026
Application No. 18/265,710

IMAGE PROCESSING METHOD AND DEVICE, STORAGE MEDIUM AND ELECTRONIC DEVICE

Status: Final Rejection (§103)
Filed: Jun 07, 2023
Examiner: WILLIAMS, REBECCA COLETTE
Art Unit: 2677
Tech Center: 2600 — Communications
Assignee: Lynxi Technologies Co. Ltd.
OA Round: 2 (Final)

Grant Probability: 43% (Moderate)
Estimated OA Rounds: 3-4
Estimated Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 43% (3 granted / 7 resolved; -19.1% vs TC avg)
Interview Lift: +66.7% across resolved cases with interview (strong)
Avg Prosecution: 2y 9m
Currently Pending: 25
Total Applications: 32 (across all art units)
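The +66.7% figure reads as a relative lift in outcomes for cases with an examiner interview. Below is a minimal sketch of how such a metric is commonly computed, assuming it is defined as the relative increase in allowance rate; the rates used are hypothetical values chosen only to reproduce the displayed number, not data from this examiner's record:

```python
def interview_lift(allow_with: float, allow_without: float) -> float:
    """Relative change in allowance rate for cases with an interview."""
    return (allow_with - allow_without) / allow_without

# Hypothetical rates for illustration: 50% with interview vs 30% without.
print(f"{interview_lift(0.50, 0.30):+.1%}")  # prints +66.7%
```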

Statute-Specific Performance

§101: 12.4% (-27.6% vs TC avg)
§103: 57.9% (+17.9% vs TC avg)
§102: 13.1% (-26.9% vs TC avg)
§112: 16.6% (-23.4% vs TC avg)

Tech Center average figures are estimates. Based on career data from 7 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Amendments made to the specification have overcome all previously held objections.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 8-10 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Otsuka (WO 2021065633 A1) in view of Yang (CN 109547786 A) and Pasteris (US 9204112 B2).

With respect to claim 1, Otsuka teaches an image processing method, comprising: acquiring a current frame image (“the image generation unit 420 acquires or generates frames of a plurality of moving images in which at least a part of the object to be represented is common at a predetermined or variable rate, and sequentially supplies the frames to the compression coding unit 422.” Page 30 paragraph 6 lines 1-3), and performing semantic feature extraction processing on the current frame image to obtain a semantic feature set of the current frame image (“… the image content acquisition unit 450 determines whether or not the scene is switched, the type of image texture displayed in the frame, the distribution of feature points, depth information, the amount of objects, and the mipmap texture used for three-dimensional graphics. Information such as the usage amount of each level, LOD (Level Of Detail), usage amount of each level of tessellation, amount of characters and symbols, type of scene to be represented, and the like can be obtained from the image generation unit 420.” Page 40 paragraph 4), generating a compressed information packet according to the semantic feature set of the current frame image (see translated Figure 26, elements 422, 450, and 424) and transmitting the compressed information packet (see translated Figure 26, elements 422, 450, 424, and 200; “the compression-encoded partial image data is transmitted to the image processing device 200 via the communication unit 426” page 45 paragraph 4 lines 1-2).
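As mapped above, the Otsuka-taught portion of claim 1 amounts to an encoder-side pipeline: extract a semantic feature set from the current frame, then package and transmit it. A minimal sketch of that flow follows; the feature extractor and packet layout are illustrative assumptions, not the applicant's or Otsuka's actual implementation:

```python
import json
import zlib

def extract_semantic_features(frame) -> dict:
    # Stand-in for any semantic analysis; Otsuka's examples include feature
    # points, edge strength, depth, texture type, and recognized objects.
    return {"objects": ["person"], "texture": "smooth", "edge_strength": 0.42}

def make_packet(frame, frame_no: int) -> bytes:
    # Generate a compressed information packet from the semantic feature set.
    payload = {"frame_no": frame_no,
               "semantic_features": extract_semantic_features(frame)}
    return zlib.compress(json.dumps(payload).encode("utf-8"))

def transmit(packet: bytes, sock) -> None:
    # Length-prefixed send over any stream socket.
    sock.sendall(len(packet).to_bytes(4, "big") + packet)
```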
Otsuka does not teach determining a historical frame image matched with the current frame image, and acquiring frame number information of the historical frame image, wherein the historical frame image matched with the current frame image is a snapshot corresponding to the current frame image; and generating a compressed information packet according to the semantic feature set of the current frame image and the frame number information of the historical frame image, and storing and/or transmitting the compressed information packet. Yang teaches determining a historical frame image matched with the current frame image (“obtained by the reference image of the intra-coded frame with the preset reference image library for comparing, determining a difference frame and marking information;” page 3 lines 11-12), and acquiring frame number information of the historical frame image (“determining a difference frame and marking information; The coding frame and the difference frame in the frame, determining the residual value; the coding the residual value; outputting the encoded residual value and identification information of the difference frame.” Page 3 lines 12-15), and generating a compressed information packet according to an image feature of the current frame image and the frame number information of the historical frame image (“determining a difference frame and marking information; The coding frame and the difference frame in the frame, determining the residual value; the coding the residual value; outputting the encoded residual value and identification information of the difference frame.” Page 3 lines 12-15), and storing and/or transmitting the compressed information packet (“determining a difference frame and marking information; The coding frame and the difference frame in the frame, determining the residual value; the coding the residual value; outputting the encoded residual value and identification information of the difference frame.” Page 3 lines 12-15 and “receiving the code of the reference image frame and the reference image frame identification information;” page 4 line 18). Yang is analogous art in the same field of endeavor as the claimed invention. Yang is directed towards video coding and decoding (“The invention relates to the field of video coding and decoding…” page 1 Technical Field line 1). A person of ordinary skill in the art would have found it obvious before the effective filing date of the claimed invention to combine the teachings of Otsuka and Yang by incorporating the reference frame identification coding process of Yang into the feature set packetizing process of Otsuka with the expectation that doing so would lead to improvements in the compression rate and the avoidance of other video or repeated image compression (“improving the compression rate, and can solve the problem of repeated coding, repeated compression” page 3 lines 5-6). Pasteris teaches an image processing method, comprising: acquiring a current frame image (see figure 2 element 202), and performing semantic feature extraction processing on the current frame image to obtain a semantic feature set of the current frame image (“In the pipeline 200, an image is captured through an image capture device 202 and a key points detector 204 then identifies or extracts key points in the image and supplies these key points to a clustering module 206. 
The clustering module 206 then performs a grouping or clustering process on the extracted key points, or the corresponding feature descriptors, to generate clusters of features or feature descriptors” page 9 col 4 lines 44-50); determining a historical frame image matched with the current frame image, wherein the historical frame image matched with the current frame image is a snapshot corresponding to the current frame image (“The server then performs additional visual searching through a features matching module 228 that uses the compressed feature descriptors and the matched cluster data and corresponding images to thereby identify reference images in the database 214 that correspond to the image captured by the device 202” page 10 col 5 lines 32-36); Pasteris is analogous art in the same field of endeavor as the claimed invention. Pasteris is directed towards encoding image features (“In the pipeline 200, an image is captured through an image capture device 202 and a key points detector 204 then identifies or extracts key points in the image and supplies these key points to a clustering module 206. The clustering module 206 then performs a grouping or clustering process on the extracted key points, or the corresponding feature descriptors, to generate clusters of features or feature descriptors” page 9 col 4 lines 44-50). A person of ordinary skill in the art would have found it obvious before the effective filing date of the claimed invention to combine the process of Otsuka, Yang, and Pasteris by utilizing Pasteris’ image matching strategy in place of Yang’s with the expectation that doing so would enable the system to more accurately and efficiently find corresponding images during the matching process (“Embodiments of the present disclosure relate generally to visual search systems and, more specifically to systems, circuits, and methods that group image feature descriptors of a captured scene into clusters to improve matching with reference images and the efficiency of transmission of such image feature descriptors.” Page 8 Technical Field). With respect to claim 2, Otsuka, Yang and Pasteris teach the image processing method of claim 1. Otsuka further teaches acquiring the semantic feature set of the current frame image from the compressed information packet (“When the compression coding unit 422 of the server 400 lowers the resolution, the server 400 may also transmit additional data that cannot be specified only by the low resolution image. Here, the additional data includes, for example, a feature amount in the original image generated by the image generation unit 420 and various parameters determined by the compression coding unit 422 at the time of compression coding. The feature amount may include at least one of the feature points of the original image, the edge strength, the depth of each pixel contained in the original image, the type of texture, the optical flow, and the motion estimation information. 
Alternatively, the additional data may include data indicating an object represented by the original image, which is specified by the object recognition process performed by the compression coding unit 422.” Page 45 paragraph 5 lines 4-5 and paragraph 6 lines 1-7) and performing image reconstruction according to the semantic feature set of the current frame image (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8) to obtain a decompressed image corresponding to the current frame image (“…a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8). Yang teaches after storing the compressed information packet, further: acquiring the frame number information of the historical frame image from the compressed information packet (“receiving the video code stream, and performing standard decoding to the video code stream to obtain the video frame;” page 4 lines 11-12); and acquiring the historical frame image from a historical frame library according to the frame number information of the historical frame image (“receiving the code of the reference image frame and the reference image frame identification information; for decoding the reference image frame, obtaining a reference image;” page 4 lines 18-19), and performing image reconstruction according to the historical frame image to obtain a decompressed image corresponding to the current frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” Page 7 last paragraph and page 8 line 1). With respect to claim 3, Otsuka, Yang and Pasteris teach the image processing method of claim 2. Yang further teaches wherein one frame of image is selected and stored in the historical frame library at preset intervals, so as to update the historical frame library (“it is necessary to update the reference image library. said reference image library to update into the update time and spatial update, the time update is for updating the reference image according to the application scene set by update time, for example in a live video scene. it can set the updating time is every 10 minutes” page 11 paragraphs 4-5). With respect to claim 4, Otsuka, Yang and Pasteris teach the image processing method of claim 3. Yang further teaches wherein a frame of image whose image change satisfies a preset requirement is taken as the historical frame image (“updating the reference image according to the detecting result of the application scene; specifically is as follows: periodically detecting the application scene, determining whether to update the reference image library according to the detecting result, if the detection result is that the application scene is changed, triggering the updating” page 11 paragraph 5). 
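Claims 3 and 4, as read onto Yang's reference-library updates, describe a historical frame library that is refreshed at preset intervals and that only admits frames whose image change meets a preset requirement. A compact sketch under those assumptions follows; the interval, change metric, and threshold are illustrative placeholders, not Yang's implementation:

```python
import numpy as np

class HistoricalFrameLibrary:
    def __init__(self, interval: int = 300, change_threshold: float = 12.0):
        self.interval = interval                  # preset update interval, in frames
        self.change_threshold = change_threshold  # preset image-change requirement
        self.frames: dict[int, np.ndarray] = {}   # frame number -> snapshot
        self._last: np.ndarray | None = None

    def maybe_store(self, frame_no: int, frame: np.ndarray) -> None:
        # Claim 3: consider one frame per preset interval.
        if frame_no % self.interval:
            return
        # Claim 4: keep it only if the image change satisfies the requirement.
        if self._last is None or self._mean_abs_diff(frame) >= self.change_threshold:
            self.frames[frame_no] = frame.copy()
            self._last = frame

    def _mean_abs_diff(self, frame: np.ndarray) -> float:
        # Simple per-pixel change measure between this frame and the last stored one.
        return float(np.mean(np.abs(frame.astype(np.int16) - self._last.astype(np.int16))))

    def get(self, frame_no: int) -> np.ndarray:
        # Decoder side: retrieve the snapshot by its frame number information.
        return self.frames[frame_no]
```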
With respect to claim 8, Otsuka teaches an image processing method, comprising: receiving a compressed information packet (see translated Figure 26, elements 422, 450, 424, and 200; “the compression-encoded partial image data is transmitted to the image processing device 200 via the communication unit 426” page 45 paragraph 4 lines 1-2 and “Then, compression coding and transmission of image data in the server 400, data reception in the image processing device 200, decoding / decompression, various image processing, and output to the display device are pipelined in units of the partial image.” Page 4 paragraph 2 lines 4-6), wherein the compressed information packet is generated according to a semantic feature set of a current frame image (see translated Figure 26, elements 422, 450, and 424), the semantic feature set of the current frame image is obtained by performing semantic feature extraction processing on the current frame image (“… the image content acquisition unit 450 determines whether or not the scene is switched, the type of image texture displayed in the frame, the distribution of feature points, depth information, the amount of objects, and the mipmap texture used for three-dimensional graphics. Information such as the usage amount of each level, LOD (Level Of Detail), usage amount of each level of tessellation, amount of characters and symbols, type of scene to be represented, and the like can be obtained from the image generation unit 420.” Page 40 paragraph 4); acquiring the semantic feature set of the current frame image from the compressed information packet (“When the compression coding unit 422 of the server 400 lowers the resolution, the server 400 may also transmit additional data that cannot be specified only by the low resolution image. Here, the additional data includes, for example, a feature amount in the original image generated by the image generation unit 420 and various parameters determined by the compression coding unit 422 at the time of compression coding. The feature amount may include at least one of the feature points of the original image, the edge strength, the depth of each pixel contained in the original image, the type of texture, the optical flow, and the motion estimation information. Alternatively, the additional data may include data indicating an object represented by the original image, which is specified by the object recognition process performed by the compression coding unit 422.” Page 45 paragraph 5 lines 4-5 and paragraph 6 lines 1-7); and performing image reconstruction according to the semantic feature set of the current frame image (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8) to obtain a decompressed image corresponding to the current frame image (“…a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8). 
Otsuka does not teach receiving a compressed information packet, wherein the compressed information packet is generated according to frame number information of a historical frame image, the frame number information is the frame number information of the historical frame image matched with the current frame image and the historical frame image matched with the current frame image is a snapshot corresponding to the current frame image; acquiring the frame number information of the historical frame image from the compressed information packet and acquiring the historical frame image from a historical frame library according to the frame number information of the historical frame image; and performing image reconstruction according to the historical frame image to obtain a decompressed image corresponding to the current frame image. Yang teaches receiving a compressed information packet (“receiving the video code stream, and performing standard decoding to the video code stream to obtain the video frame;” page 4 lines 11-12), wherein the compressed information packet is generated according to frame number information of a historical frame image (“receiving the code of the reference image frame and the reference image frame identification information; for decoding the reference image frame, obtaining a reference image;” page 4 lines 18-19), the frame number information is the frame number information of the historical frame image matched with the current frame image (“determining a difference frame and marking information; The coding frame and the difference frame in the frame, determining the residual value; the coding the residual value; outputting the encoded residual value and identification information of the difference frame.” Page 3 lines 12-15); acquiring the frame number information of the historical frame image from the compressed information packet (“receiving the code of the reference image frame and the reference image frame identification information; for decoding the reference image frame, obtaining a reference image;” page 4 lines 18-19) and acquiring the historical frame image from a historical frame library according to the frame number information of the historical frame image (“receiving the code of the reference image frame and the reference image frame identification information; for decoding the reference image frame, obtaining a reference image;” page 4 lines 18-19), and performing image reconstruction according to the historical frame image to obtain a decompressed image corresponding to the current frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” Page 7 last paragraph and page 8 line 1). Yang is analogous art in the same field of endeavor as the claimed invention. Yang is directed towards video coding and decoding (“The invention relates to the field of video coding and decoding…” page 1 Technical Field line 1). 
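The claim 8 flow pieced together here from Otsuka and Yang mirrors the encoder: parse the received packet, then reconstruct either from the transmitted semantic feature set or from a library snapshot located by its frame number. A hedged sketch, reusing the hypothetical packet format and library classes from the earlier sketches:

```python
import json
import zlib
import numpy as np

def reconstruct_from_features(features: dict) -> np.ndarray:
    # Placeholder for feature-driven reconstruction (e.g., the generation
    # network discussed for later claims); returns a blank canvas here.
    return np.zeros((720, 1280, 3), dtype=np.uint8)

def decode_packet(packet: bytes, library) -> np.ndarray:
    payload = json.loads(zlib.decompress(packet).decode("utf-8"))
    if "historical_frame_no" in payload:
        # Yang-mapped path: fetch the matched snapshot from the historical
        # frame library by its frame number information and rebuild from it.
        return library.get(payload["historical_frame_no"]).copy()
    # Otsuka-mapped path: rebuild directly from the semantic feature set.
    return reconstruct_from_features(payload["semantic_features"])
```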
A person of ordinary skill in the art would have found it obvious before the effective filing date of the claimed invention to combine the teachings of Otsuka and Yang by incorporating the reference frame identification coding process of Yang into the feature set packetizing process of Otsuka with the expectation that doing so would lead to improvements in the compression rate and the avoidance of other video or repeated image compression (“improving the compression rate, and can solve the problem of repeated coding, repeated compression” page 3 lines 5-6). Pasteris teaches the historical frame image matched with the current frame image is a snapshot corresponding to the current frame image (“The server then performs additional visual searching through a features matching module 228 that uses the compressed feature descriptors and the matched cluster data and corresponding images to thereby identify reference images in the database 214 that correspond to the image captured by the device 202” page 10 col 5 lines 32-36). Pasteris is analogous art in the same field of endeavor as the claimed invention. Pasteris is directed towards encoding image features (“In the pipeline 200, an image is captured through an image capture device 202 and a key points detector 204 then identifies or extracts key points in the image and supplies these key points to a clustering module 206. The clustering module 206 then performs a grouping or clustering process on the extracted key points, or the corresponding feature descriptors, to generate clusters of features or feature descriptors” page 9 col 4 lines 44-50). A person of ordinary skill in the art would have found it obvious before the effective filing date of the claimed invention to combine the process of Otsuka, Yang, and Pasteris by utilizing Pasteris’ image matching strategy in place of Yang’s with the expectation that doing so would enable the system to more accurately and efficiently find corresponding images during the matching process (“Embodiments of the present disclosure relate generally to visual search systems and, more specifically to systems, circuits, and methods that group image feature descriptors of a captured scene into clusters to improve matching with reference images and the efficiency of transmission of such image feature descriptors.” Page 8 Technical Field). With respect to claim 9, Otsuka, Yang and Pasteris teach all of the limitations in consideration of its parent, claim 1. Otsuka further teaches a non-transitory computer-readable storage medium (“various memories” page 6 paragraph 4 line 3) having stored thereon an image processing program (“in terms of software, an information processing function and an image loaded into memory from a recording medium.” Page 6 paragraph 4 lines 3-4), which, when executed by a processor (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU” page 6 paragraph 4 lines 1-2), causes the processor to perform the image processing method of claim 1 (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. 
It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6). With respect to claim 10, Otsuka, Yang and Pasteris teach all the claim limitations in consideration of its parent, claim 1. Otsuka further teaches an electronic device (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, GPU, encoder, decoder, arithmetic unit, various memories, etc. in terms of hardware …” page 6 paragraph 4 lines 1-3), comprising a memory (“various memories” page 6 paragraph 4 line 3), a processor (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU” page 6 paragraph 4 lines 1-2), and an image processing program which is stored on the memory (“in terms of software, an information processing function and an image loaded into memory from a recording medium.” Page 6 paragraph 4 lines 3-4) and capable of running on the processor (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6), wherein when the processor executes the image processing program, the image processing method of claim 1 is performed (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6). With respect to claim 18, Otsuka, Yang and Pasteris teach all limitations in consideration of claim 8. Otsuka further teaches a non-transitory computer-readable storage medium (“various memories” page 6 paragraph 4 line 3) having stored thereon an image processing program (“in terms of software, an information processing function and an image loaded into memory from a recording medium.” Page 6 paragraph 4 lines 3-4), which, when executed by a processor (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU” page 6 paragraph 4 lines 1-2), causes the processor to perform the image processing method of claim 8 (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. 
It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6). With respect to claim 19, Otsuka, Yang and Pasteris teach all limitations in consideration of claim 8. Otsuka further teaches an electronic device (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, GPU, encoder, decoder, arithmetic unit, various memories, etc. in terms of hardware …” page 6 paragraph 4 lines 1-3), comprising a memory (“various memories” page 6 paragraph 4 line 3), a processor (“FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU” page 6 paragraph 4 lines 1-2), and an image processing program which is stored on the memory (“in terms of software, an information processing function and an image loaded into memory from a recording medium.” Page 6 paragraph 4 lines 3-4) and capable of running on the processor (FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6), wherein when the processor executes the image processing program, the image processing method of claim 8 is performed (FIG. 5 shows the functional blocks of the server 400 and the image processing device 200 of this embodiment. Each functional block shown in the figure can be realized by a CPU, … various memories, etc. in terms of hardware, and in terms of software, an information processing function and an image loaded into memory from a recording medium. It is realized by a program that exerts various functions such as drawing function, data input / output function, and communication function” page 6 paragraph 4 lines 1-6). Claims 5, 12, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Otsuka, Yang and Pasteris as applied to claims 2, 3 and 4 above, and further in view of Jacobs (US 20110047384 A1). With respect to claim 5, Otsuka, Yang and Pasteris teach the image processing method of claim 2. However, they do not explicitly teach the rest of the claim limitations. 
Jacobs teaches a case where the current frame image is an image containing a person (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee”), performing the semantic feature extraction processing on the current frame image comprises: detecting person in the current frame image (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee” ), and acquiring Identity Document (ID) information of at least one person (“In some embodiments, once a face has been identified, a request for confirmation can be broadcast together with a key, pursuant to building a network. The request may, for instance, identify the requester and ask for confirmation by way of furnishing the name of the person responding to the request or the device address of the person's device.” In paragraph 0022 see also examples 4 “badge” and 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge); recognizing a person-related attribute of the current frame image (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”) to obtain feature information of the at least one person (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”); and encoding the feature information of the at least one person (“feature tags” in paragraph 0022 and “For instance, facial characteristics may be input into a file referred to as a feature tag. Comparing feature tag characteristics with those gleaned from the photograph information taken from the mobile device may serve as a primary basis for making the facial identification.” Paragraph 0020 and “The captured image may be reduced to information input to a feature tag and that feature tag information may be compared with the feature tag information corresponding to entries in the database.” In paragraph 0034), and generating the semantic feature set of the current frame image according to an encoding result and the ID information of the at least one person (“the feature tag information may be broadcast including name information of the user of a device” in paragraph 0022). Jacobs is analogous art in the same field of endeavor as the claimed invention. Jacobs is directed towards detecting features related to people and acquiring ID document information (“Several approaches to facial identification are contemplated. These include knowledge-based methods which encode facial features according to rules applied based on the typical face; template matching methods which match images to those from a catalog of stored facial images or features; appearance based methods which develop models for comparison based on training images; and feature invariant models which use algorithms to discover facial features even though the view and pose of the subject and/or lighting conditions change.” Paragraph 0011 and Example 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge” ). 
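Claim 5, as read onto Jacobs, adds a person-specific extraction step: detect each person in the current frame, acquire ID information, encode the person-related attributes, and fold the result into the semantic feature set. A sketch under assumed detector and ID-database interfaces; Jacobs' "feature tag" is approximated here by a raw descriptor, which is an illustrative simplification:

```python
from dataclasses import dataclass

@dataclass
class PersonFeature:
    person_id: str      # ID information (e.g., badge or directory identity)
    feature_tag: bytes  # encoded person-related attributes

def person_semantic_features(frame, detector, id_db) -> list[PersonFeature]:
    features = []
    for face in detector.detect(frame):   # detect persons in the frame
        pid = id_db.match(face)           # acquire ID information
        tag = bytes(face.descriptor)      # encode the feature information
        features.append(PersonFeature(pid, tag))
    return features
```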
A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris and Jacobs by utilizing the person-based identification methodology of Jacobs inside the semantic feature-based compression system of Otsuka, Yang and Pasteris, treating the identified person-based features as further semantic features, with the expectation that doing so would lead to the system being able to perform highly desired people recognition based tasks while not introducing intense performance requirements (“A method of identifying faces of those from a group of people in an organization, particularly for social networking, workplace functions, etc. would be highly desirable. While many face recognition algorithms exist, a limiting aspect of most is the amount of time and computing power required to run the algorithm given all of the characteristics for comparison under consideration….Nonetheless, face recognition programs can be very useful when properly matched with an application providing an economy of scale for uses better suited to exploit or synergize the face recognition capability benefits. The following accomplishes this task.” Paragraph 0001). With respect to claim 12, Otsuka, Yang and Pasteris teach the image processing method of claim 3. However, they do not explicitly teach the rest of the claim limitations. Jacobs teaches a case where the current frame image is an image containing a person (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee”), performing the semantic feature extraction processing on the current frame image comprises: detecting person in the current frame image (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee” ), and acquiring Identity Document (ID) information of at least one person (“In some embodiments, once a face has been identified, a request for confirmation can be broadcast together with a key, pursuant to building a network. The request may, for instance, identify the requester and ask for confirmation by way of furnishing the name of the person responding to the request or the device address of the person's device.” In paragraph 0022 see also examples 4 “badge” and 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge); recognizing a person-related attribute of the current frame image (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”) to obtain feature information of the at least one person (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”); and encoding the feature information of the at least one person (“feature tags” in paragraph 0022 and “For instance, facial characteristics may be input into a file referred to as a feature tag. 
Comparing feature tag characteristics with those gleaned from the photograph information taken from the mobile device may serve as a primary basis for making the facial identification.” Paragraph 0020 and “The captured image may be reduced to information input to a feature tag and that feature tag information may be compared with the feature tag information corresponding to entries in the database.” In paragraph 0034), and generating the semantic feature set of the current frame image according to an encoding result and the ID information of the at least one person (“the feature tag information may be broadcast including name information of the user of a device” in paragraph 0022). Jacobs is analogous art in the same field of endeavor as the claimed invention. Jacobs is directed towards detecting features related to people and acquiring ID document information (“Several approaches to facial identification are contemplated. These include knowledge-based methods which encode facial features according to rules applied based on the typical face; template matching methods which match images to those from a catalog of stored facial images or features; appearance based methods which develop models for comparison based on training images; and feature invariant models which use algorithms to discover facial features even though the view and pose of the subject and/or lighting conditions change.” Paragraph 0011 and Example 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge” ). A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris and Jacobs by utilizing the person-based identification methodology of Jacobs inside the semantic feature-based compression system of Otsuka, Yang and Pasteris, treating the identified person-based features as further semantic features, with the expectation that doing so would lead to the system being able to perform highly desired people recognition based tasks while not introducing intense performance requirements (“A method of identifying faces of those from a group of people in an organization, particularly for social networking, workplace functions, etc. would be highly desirable. While many face recognition algorithms exist, a limiting aspect of most is the amount of time and computing power required to run the algorithm given all of the characteristics for comparison under consideration….Nonetheless, face recognition programs can be very useful when properly matched with an application providing an economy of scale for uses better suited to exploit or synergize the face recognition capability benefits. The following accomplishes this task.” Paragraph 0001). With respect to claim 15, Otsuka, Yang and Pasteris teach the image processing method of claim 4. However, they do not explicitly teach the rest of the claim limitations. 
Jacobs teaches a case where the current frame image is an image containing a person (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee”), performing the semantic feature extraction processing on the current frame image comprises: detecting person in the current frame image (“In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space” in paragraph 0022 see also examples 2 “facial identity”, 4 “guest” and 5 “employee” ), and acquiring Identity Document (ID) information of at least one person (“In some embodiments, once a face has been identified, a request for confirmation can be broadcast together with a key, pursuant to building a network. The request may, for instance, identify the requester and ask for confirmation by way of furnishing the name of the person responding to the request or the device address of the person's device.” In paragraph 0022 see also examples 4 “badge” and 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge); recognizing a person-related attribute of the current frame image (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”) to obtain feature information of the at least one person (“facial identification” in paragraph 0022 see also examples 2 “facial identity”, 4 “biometric data observed” and 5 “employee’s picture”); and encoding the feature information of the at least one person (“feature tags” in paragraph 0022 and “For instance, facial characteristics may be input into a file referred to as a feature tag. Comparing feature tag characteristics with those gleaned from the photograph information taken from the mobile device may serve as a primary basis for making the facial identification.” Paragraph 0020 and “The captured image may be reduced to information input to a feature tag and that feature tag information may be compared with the feature tag information corresponding to entries in the database.” In paragraph 0034), and generating the semantic feature set of the current frame image according to an encoding result and the ID information of the at least one person (“the feature tag information may be broadcast including name information of the user of a device” in paragraph 0022). Jacobs is analogous art in the same field of endeavor as the claimed invention. Jacobs is directed towards detecting features related to people and acquiring ID document information (“Several approaches to facial identification are contemplated. These include knowledge-based methods which encode facial features according to rules applied based on the typical face; template matching methods which match images to those from a catalog of stored facial images or features; appearance based methods which develop models for comparison based on training images; and feature invariant models which use algorithms to discover facial features even though the view and pose of the subject and/or lighting conditions change.” Paragraph 0011 and Example 5 “a camera grabs the employee's picture (step 704), and uses it to find the employees phone or wireless badge” ). 
A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris and Jacobs by utilizing the person-based identification methodology of Jacobs inside the semantic feature-based compression system of Otsuka, Yang and Pasteris, treating the identified person-based features as further semantic features, with the expectation that doing so would lead to the system being able to perform highly desired people recognition based tasks while not introducing intense performance requirements (“A method of identifying faces of those from a group of people in an organization, particularly for social networking, workplace functions, etc. would be highly desirable. While many face recognition algorithms exist, a limiting aspect of most is the amount of time and computing power required to run the algorithm given all of the characteristics for comparison under consideration….Nonetheless, face recognition programs can be very useful when properly matched with an application providing an economy of scale for uses better suited to exploit or synergize the face recognition capability benefits. The following accomplishes this task.” Paragraph 0001). Claims 6-7, 13-14, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Otsuka, Yang, Pasteris, and Jacobs as applied to claims 5, 12, and 15 above, and further in view of Ma (CN 109522912 A). With respect to claim 6, Otsuka, Yang, Pasteris, and Jacobs teach the image processing method of claim 5. Otsuka, Yang, Pasteris, and Jacobs do not explicitly teach the additional limitations. Ma teaches wherein the feature information of the person comprises at least one of skeleton and outline information (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), pose information (“Optionally, one or more key detection part comprises a head, shoulder, chest, hands, wrist, hip, armpit, crotch, waist, abdomen, foot … Alternatively, the articulation comprises a head, a right shoulder, the left shoulder, shoulder, right elbow, left elbow, right wrist, left wrist, right hip, , right knee, left, right ankle, embrace one or more” page 2 paragraphs 4 and 5, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), and head angle information (“Optionally, one or more key detection part comprises a head, shoulder, chest, hands, wrist, hip, armpit, crotch, waist, abdomen, foot … Alternatively, the articulation comprises a head, a right shoulder, the left shoulder, shoulder, right elbow, left elbow, right wrist, left wrist, right hip, , right knee, left, right ankle, embrace one or more” page 2 paragraphs 4 and 5, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1) of the person. 
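Claim 6's feature information, as mapped onto Ma, spans skeleton/outline, pose (joint positions), and head angle. One compact way to model that feature information is sketched below; the field names and types are assumptions for illustration, not Ma's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class PersonPoseFeatures:
    outline: list[tuple[float, float]] = field(default_factory=list)      # body contour points
    joints: dict[str, tuple[float, float]] = field(default_factory=dict)  # e.g. "left_elbow" -> (x, y)
    head_angle_deg: float = 0.0                                           # head orientation
```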
Ma is analogous art in the same field of endeavor as the claimed invention. Ma is directed towards detection of people and their various body parts (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1). A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris, Jacobs and Ma by utilizing the human pose, outline, and articulation detection methodology of Ma inside the combined system of Otsuka, Yang, Pasteris, and Jacobs, treating these pose and shape-based human features as further semantic features, with the expectation that doing so would lead to quality improvements in recognition and inspection quality of human subjects (“realize effective evaluation of the inspection personnel working quality, which reaches the aim of improving the inspection personnel working quality and improves the safety of the validity and security of the security” page 4 paragraph 11 lines 5-6). With respect to claim 7, Otsuka, Yang, Pasteris, Jacobs, and Ma teach the processing method of claim 6. Otsuka teaches performing the image reconstruction according to the semantic feature set of the current frame image (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8) comprises: determining the feature information (“When the compression coding unit 422 of the server 400 lowers the resolution, the server 400 may also transmit additional data that cannot be specified only by the low resolution image. Here, the additional data includes, for example, a feature amount in the original image generated by the image generation unit 420 and various parameters determined by the compression coding unit 422 at the time of compression coding. The feature amount may include at least one of the feature points of the original image, the edge strength, the depth of each pixel contained in the original image, the type of texture, the optical flow, and the motion estimation information. Alternatively, the additional data may include data indicating an object represented by the original image, which is specified by the object recognition process performed by the compression coding unit 422.” Page 45 paragraph 5 lines 4-5 and paragraph 6 lines 1-7) and generating, according to the feature information, an image (“…a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8) and generating the decompressed image according to the feature information (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” Page 45 paragraph 6 lines 7-8), by adopting a generation network (“It is desirable to optimize the above-mentioned decision rules themselves. Therefore, the decision rule may be optimized by machine learning or deep learning while collecting the adjustment results in various past cases. 
When machine learning is performed here, the object of optimization may be a table in which decision rules are defined or a calculation model. For deep learning, optimize the computational model. In these learning techniques, for example, a score database created manually and a gameplay experience by a user are used as teacher data. In addition, learning is performed by using the case of subjective drawing as a constraint condition of the calculation model, PSNR (Peak Signal-to-Noise Ratio) indicating image quality, SSIM (Structural Similarity), parameter switching frequency, time series smoothness, etc. as indicators” page 44 paragraph 6 and 7). Otsuka does not explicitly disclose the use of a historical frame or person specific features; however, it does mention people as an object within a frame that can be extracted, compressed, and later decompressed as a part of a reconstructed image (“Objects on the screen that reflect user operations are objects that the user operates, such as people” page 44 paragraph 1 and “As a general rule, information indicating the relationship between the user's operation content and the objects on the screen is acquired from the image generation unit 420. The image content acquisition unit 450 may infer the information.” Page 44 paragraph 2), using a generation network as described above (human image generator network). Yang teaches performing the image reconstruction according to the historical frame image of the current frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” Page 7 last paragraph and page 8 line 1) comprises generating the decompressed image according to the historical frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” Page 7 last paragraph and page 8 line 1). Jacobs teaches determining the feature information of the at least one person according to the ID information of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against. In some embodiments, the feature tag information may be broadcast including name information of the user of a device. In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space, etc.” in paragraph 0022 see also example 4 “During step 604 a photo (or video or other biometric data) is secured/taken from/of the subject. 
At 606, a decision is made as to whether there's a match between the biometric data observed or taken from the visitor and database information. For instance, biometric data from a subject may be collected in advance in forming a database.”), and generating, according to the feature information of the at least one person, an image of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against…. Should the facial image and the feature tag information match” in paragraph 0022, with feature tags as generated images). Ma teaches outline information as a semantic image feature (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1). With respect to claim 13, Otsuka, Yang, Pasteris, and Jacobs teach the processing method of claim 12. Otsuka, Yang, Pasteris, and Jacobs do not explicitly teach the additional limitations. Ma teaches wherein the feature information of the person comprises at least one of skeleton and outline information (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), pose information (“Optionally, one or more key detection part comprises a head, shoulder, chest, hands, wrist, hip, armpit, crotch, waist, abdomen, foot … Alternatively, the articulation comprises a head, a right shoulder, the left shoulder, shoulder, right elbow, left elbow, right wrist, left wrist, right hip, , right knee, left, right ankle, embrace one or more” page 2 paragraphs 4 and 5, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), and head angle information (“Optionally, one or more key detection part comprises a head, shoulder, chest, hands, wrist, hip, armpit, crotch, waist, abdomen, foot … Alternatively, the articulation comprises a head, a right shoulder, the left shoulder, shoulder, right elbow, left elbow, right wrist, left wrist, right hip, , right knee, left, right ankle, embrace one or more” page 2 paragraphs 4 and 5, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1) of the person. Ma is analogous art in the same field of endeavor as the claimed invention. 
Ma is directed towards detection of people and their various body parts (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1). A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris, Jacobs, and Ma by utilizing the human pose, outline, and articulation detection methodology of Ma inside the combined system of Otsuka, Yang, Pasteris, and Jacobs, treating these pose- and shape-based human features as further semantic features, with the expectation that doing so would lead to improvements in the recognition and inspection quality of human subjects (“realize effective evaluation of the inspection personnel working quality, which reaches the aim of improving the inspection personnel working quality and improves the safety of the validity and security of the security” page 4 paragraph 11 lines 5-6).

With respect to claim 14, Otsuka, Yang, Pasteris, Jacobs, and Ma teach the processing method of claim 13. Otsuka teaches performing the image reconstruction according to the semantic feature set of the current frame image (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8) comprises: determining the feature information (“When the compression coding unit 422 of the server 400 lowers the resolution, the server 400 may also transmit additional data that cannot be specified only by the low resolution image. Here, the additional data includes, for example, a feature amount in the original image generated by the image generation unit 420 and various parameters determined by the compression coding unit 422 at the time of compression coding. The feature amount may include at least one of the feature points of the original image, the edge strength, the depth of each pixel contained in the original image, the type of texture, the optical flow, and the motion estimation information. Alternatively, the additional data may include data indicating an object represented by the original image, which is specified by the object recognition process performed by the compression coding unit 422.” page 45 paragraph 5 lines 4-5 and paragraph 6 lines 1-7), generating, according to the feature information, an image (“…a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8), and generating the decompressed image according to the feature information (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8), by adopting a generation network (“It is desirable to optimize the above-mentioned decision rules themselves. Therefore, the decision rule may be optimized by machine learning or deep learning while collecting the adjustment results in various past cases.
When machine learning is performed here, the object of optimization may be a table in which decision rules are defined or a calculation model. For deep learning, optimize the computational model. In these learning techniques, for example, a score database created manually and a gameplay experience by a user are used as teacher data. In addition, learning is performed by using the case of subjective drawing as a constraint condition of the calculation model, PSNR (Peak Signal-to-Noise Ratio) indicating image quality, SSIM (Structural Similarity), parameter switching frequency, time series smoothness, etc. as indicators” page 44 paragraphs 6 and 7).

Otsuka does not explicitly disclose the use of a historical frame or person-specific features; however, it does mention people as objects within a frame that can be extracted, compressed, and later decompressed as part of a reconstructed image (“Objects on the screen that reflect user operations are objects that the user operates, such as people” page 44 paragraph 1 and “As a general rule, information indicating the relationship between the user's operation content and the objects on the screen is acquired from the image generation unit 420. The image content acquisition unit 450 may infer the information.” page 44 paragraph 2), using a generation network as described above (human image generator network).

Yang teaches performing the image reconstruction according to the historical frame image of the current frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” page 7 last paragraph and page 8 line 1) comprises generating the decompressed image according to the historical frame image (same passage, page 7 last paragraph and page 8 line 1).

Jacobs teaches determining the feature information of the at least one person according to the ID information of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against. In some embodiments, the feature tag information may be broadcast including name information of the user of a device. In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space, etc.” in paragraph 0022; see also example 4: “During step 604 a photo (or video or other biometric data) is secured/taken from/of the subject.
At 606, a decision is made as to whether there's a match between the biometric data observed or taken from the visitor and database information. For instance, biometric data from a subject may be collected in advance in forming a database.”), and generating, according to the feature information of the at least one person, an image of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against…. Should the facial image and the feature tag information match” in paragraph 0022, with feature tags as generated images). Ma teaches outline information as a semantic image feature (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1).

With respect to claim 16, Otsuka, Yang, Pasteris, and Jacobs teach the image processing method of claim 15. Otsuka, Yang, Pasteris, and Jacobs do not explicitly teach the additional limitations. Ma teaches wherein the feature information of the person comprises at least one of skeleton and outline information (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), pose information (“Optionally, one or more key detection part comprises a head, shoulder, chest, hands, wrist, hip, armpit, crotch, waist, abdomen, foot … Alternatively, the articulation comprises a head, a right shoulder, the left shoulder, shoulder, right elbow, left elbow, right wrist, left wrist, right hip, , right knee, left, right ankle, embrace one or more” page 2 paragraphs 4 and 5, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1), and head angle information (same passages, page 2 paragraphs 4 and 5 and page 3 paragraph 1) of the person. Ma is analogous art in the same field of endeavor as the claimed invention.
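The inter-prediction idea quoted from Yang in these mappings (encode only the difference against a historical reference frame, then reconstruct by adding the decoded difference back) can be sketched schematically as follows. This is a generic illustration of difference coding under our own naming, not Yang's actual codec.

```python
# Schematic residual coding against a historical reference frame.
# Illustrative only; names and array shapes are hypothetical.
import numpy as np

def encode_difference(current: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Residual between the current frame and the historical reference frame."""
    return current.astype(np.int16) - reference.astype(np.int16)

def decode_difference(residual: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Reconstruct the current frame by adding the residual back to the reference."""
    return np.clip(reference.astype(np.int16) + residual, 0, 255).astype(np.uint8)

# Round trip on random 8-bit frames:
ref = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
cur = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
assert np.array_equal(decode_difference(encode_difference(cur, ref), ref), cur)
```

Only the residual needs to be transmitted when scene changes are limited, which is the compression benefit Yang's quoted passage describes.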
Ma is directed towards detection of people and their various body parts (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4, and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1). A person of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to combine Otsuka, Yang, Pasteris, Jacobs, and Ma by utilizing the human pose, outline, and articulation detection methodology of Ma inside the combined system of Otsuka, Yang, Pasteris, and Jacobs, treating these pose- and shape-based human features as further semantic features, with the expectation that doing so would lead to improvements in the recognition and inspection quality of human subjects (“realize effective evaluation of the inspection personnel working quality, which reaches the aim of improving the inspection personnel working quality and improves the safety of the validity and security of the security” page 4 paragraph 11 lines 5-6).

With respect to claim 17, Otsuka, Yang, Pasteris, Jacobs, and Ma teach the processing method of claim 16. Otsuka teaches performing the image reconstruction according to the semantic feature set of the current frame image (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8) comprises: determining the feature information (“When the compression coding unit 422 of the server 400 lowers the resolution, the server 400 may also transmit additional data that cannot be specified only by the low resolution image. Here, the additional data includes, for example, a feature amount in the original image generated by the image generation unit 420 and various parameters determined by the compression coding unit 422 at the time of compression coding. The feature amount may include at least one of the feature points of the original image, the edge strength, the depth of each pixel contained in the original image, the type of texture, the optical flow, and the motion estimation information. Alternatively, the additional data may include data indicating an object represented by the original image, which is specified by the object recognition process performed by the compression coding unit 422.” page 45 paragraph 5 lines 4-5 and paragraph 6 lines 1-7), generating, according to the feature information, an image (“…a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8), and generating the decompressed image according to the feature information (“In this case, the decoding / stretching unit 242 of the image processing device 200 accurately generates a high-resolution display image based on the transmitted image data and additional data.” page 45 paragraph 6 lines 7-8), by adopting a generation network (“It is desirable to optimize the above-mentioned decision rules themselves. Therefore, the decision rule may be optimized by machine learning or deep learning while collecting the adjustment results in various past cases.
When machine learning is performed here, the object of optimization may be a table in which decision rules are defined or a calculation model. For deep learning, optimize the computational model. In these learning techniques, for example, a score database created manually and a gameplay experience by a user are used as teacher data. In addition, learning is performed by using the case of subjective drawing as a constraint condition of the calculation model, PSNR (Peak Signal-to-Noise Ratio) indicating image quality, SSIM (Structural Similarity), parameter switching frequency, time series smoothness, etc. as indicators” page 44 paragraphs 6 and 7).

Otsuka does not explicitly disclose the use of a historical frame or person-specific features; however, it does mention people as objects within a frame that can be extracted, compressed, and later decompressed as part of a reconstructed image (“Objects on the screen that reflect user operations are objects that the user operates, such as people” page 44 paragraph 1 and “As a general rule, information indicating the relationship between the user's operation content and the objects on the screen is acquired from the image generation unit 420. The image content acquisition unit 450 may infer the information.” page 44 paragraph 2), using a generation network as described above (human image generator network).

Yang teaches performing the image reconstruction according to the historical frame image of the current frame image (“The basic idea the encoding and decoding method using inter prediction coding of the original base frame intra-frame prediction coding or other complex code to simplify processing, under the condition of limited of video scene changes, only needs to encode the difference part and there is no need to compress the whole image, and decoding only needs to simply added to the reconstructed video frame, therefore, using the encoding and decoding method can simplify the encoding and decoding process and improves the compression rate of intra-coded frame.” page 7 last paragraph and page 8 line 1) comprises generating the decompressed image according to the historical frame image (same passage, page 7 last paragraph and page 8 line 1).

Jacobs teaches determining the feature information of the at least one person according to the ID information of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against. In some embodiments, the feature tag information may be broadcast including name information of the user of a device. In such embodiments the facial identification program is used to make identifications among this limited set, e.g., people within room, in a conference meeting space, etc.” in paragraph 0022; see also example 4: “During step 604 a photo (or video or other biometric data) is secured/taken from/of the subject.
At 606, a decision is made as to whether there's a match between the biometric data observed or taken from the visitor and database information. For instance, biometric data from a subject may be collected in advance in forming a database.”), and generating, according to the feature information of the at least one person, an image of the at least one person (“devices may periodically broadcast their feature tag information on the discovery channel, and in some embodiments, thereby forming the database for an image to be compared against…. Should the facial image and the feature tag information match” in paragraph 0022, with feature tags as generated images). Ma teaches outline information as a semantic image feature (“plurality of position of articulation determination of detected human body contour, according to human body outline emphasis is the detection position of the part” page 2 summary of the invention paragraph 3 lines 3-4 and “Optionally, according to dynamic image data of detected person obtains positions of the nodes comprises: a dynamic video image of the detected people for joint analysis, obtaining the position of the node.” page 3 paragraph 1).

Response to Arguments

Applicant's arguments filed 11/22/2025 have been fully considered. Due to the cancellation of claim 11, the associated 35 U.S.C. 112(a) and 35 U.S.C. 112(b) rejections have been withdrawn. On pages 9-10 of applicant's remarks, applicant argues that the combination of Otsuka, Alzina et al., Jacobs, and/or Ma fails to teach the newly added limitation “wherein the historical frame image matched with the current frame image is a snapshot corresponding to the current frame image” in claims 1 and 8. After an updated search, additional references were found that teach this limitation (see the claim mapping above), and the corresponding 35 U.S.C. 103 rejections have been updated. Accordingly, the examiner views the above argument as moot, and the claims remain rejected. Further, on page 10 the applicant argues that claims 2-7, 9-10, and 12-19 are allowable on the basis that they depend on claims 1 and 8. Due to the above findings, the examiner views this argument as moot, and those claims likewise remain rejected. Due to the substantial changes made to the rejection, the request for an interview is denied.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action (a worked date example appears below).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to REBECCA C WILLIAMS whose telephone number is (571)272-7074. The examiner can normally be reached M-F 7:30am - 4:00pm.
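A worked example of the reply-deadline arithmetic stated in the conclusion above, using the Feb 20, 2026 mailing date shown in the prosecution timeline below. This is an illustrative, standard-library-only sketch: it simply adds calendar months and deliberately ignores the advisory-action adjustment, weekend/holiday rollover, and extension fees.

```python
# Hypothetical deadline arithmetic for a final action mailed 2026-02-20:
# three-month shortened statutory period, six-month absolute statutory cap.
from datetime import date

def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    days_in_month = [31, 29 if leap else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1]
    return date(year, month, min(d.day, days_in_month))

mailed = date(2026, 2, 20)    # mailing date of this final action
print(add_months(mailed, 3))  # 2026-05-20: shortened statutory period
print(add_months(mailed, 6))  # 2026-08-20: absolute six-month cap
```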
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew W Bee, can be reached at (571)270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/REBECCA COLETTE WILLIAMS/
Examiner, Art Unit 2677

/ANDREW W BEE/
Supervisory Patent Examiner, Art Unit 2677
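To make the Jacobs and Ma mappings in the action concrete (determining a person's feature information from ID information, with skeleton, outline, pose, and head-angle data treated as semantic features), here is a purely illustrative sketch. Every type, field, and identifier below is hypothetical; none of it comes from the cited references or the application.

```python
# Hypothetical ID-to-feature lookup in the spirit of the Jacobs mapping,
# with person-feature fields in the spirit of the Ma mapping.
from dataclasses import dataclass, field

@dataclass
class PersonFeatures:
    person_id: str
    skeleton: list[tuple[float, float]] = field(default_factory=list)  # joint (x, y) positions
    outline: list[tuple[float, float]] = field(default_factory=list)   # body contour points
    head_angle: float = 0.0                                            # hypothetical degrees convention

# A feature database keyed by person ID, loosely analogous to the broadcast
# feature-tag database Jacobs describes.
feature_db: dict[str, PersonFeatures] = {
    "person-001": PersonFeatures("person-001", skeleton=[(0.5, 0.1), (0.5, 0.4)], head_angle=12.0),
}

def lookup_person_features(person_id: str) -> PersonFeatures | None:
    """Return stored feature information for a person ID, if known."""
    return feature_db.get(person_id)

print(lookup_person_features("person-001"))
```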

Prosecution Timeline

Jun 07, 2023
Application Filed
Aug 20, 2025
Non-Final Rejection — §103
Nov 22, 2025
Response Filed
Feb 20, 2026
Final Rejection — §103 (current)


Prosecution Projections

3-4
Expected OA Rounds
43%
Grant Probability
99%
With Interview (+66.7%)
2y 9m
Median Time to Grant
Moderate
PTA Risk
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.
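For transparency, the headline numbers above can be reproduced from the stated counts. The allow rate follows directly from "3 granted / 7 resolved"; the interview-lift formula is an assumption about how the page derives "+66.7%" (relative increase of the with-interview allow rate over the without-interview baseline), since the per-case split is not shown.

```python
# Reproducing the displayed statistics; the lift split below is hypothetical.
granted, resolved = 3, 7
allow_rate = granted / resolved
print(f"{allow_rate:.0%}")  # 43%

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Relative lift of the with-interview allow rate over the baseline."""
    return rate_with / rate_without - 1

# One hypothetical split consistent with +66.7%: 100% with vs 60% without.
print(f"{interview_lift(1.0, 0.6):+.1%}")  # +66.7%
```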
