Prosecution Insights
Last updated: April 19, 2026
Application No. 18/380,384

DATA CONVERSION DEVICE, MOVING IMAGE CONVERSION SYSTEM, DATA CONVERSION METHOD, AND RECORDING MEDIUM

Final Rejection — §103

Filed: Oct 16, 2023
Examiner: ALLEN, KYLA GUAN-PING TI
Art Unit: 2661
Tech Center: 2600 — Communications
Assignee: NEC Corporation
OA Round: 2 (Final)
Grant Probability: 89% (Favorable)
OA Rounds: 3-4
To Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 89% (47 granted / 53 resolved; +26.7% vs TC avg) — above average
Interview Lift: strong, +17.1% (allowance in resolved cases with an interview vs. without)
Typical Timeline: 3y 0m average prosecution; 30 applications currently pending
Career History: 83 total applications across all art units

Statute-Specific Performance

§101: 9.9% (-30.1% vs TC avg)
§103: 52.5% (+12.5% vs TC avg)
§102: 19.3% (-20.7% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 53 resolved cases.
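The headline figures above are self-consistent, as a quick arithmetic check shows. The snippet below assumes the panel simply divides granted by resolved cases and subtracts the stated delta to estimate the Tech Center baseline; that methodology is an assumption for illustration, not something stated in the record.

```python
# Minimal arithmetic check of the examiner panel figures (assumed methodology).
granted, resolved = 47, 53
career_allow_rate = granted / resolved            # 0.8868... -> displayed as 89%
tc_average_estimate = career_allow_rate - 0.267   # "+26.7% vs TC avg" implies a ~62% baseline

print(f"Career allow rate: {career_allow_rate:.1%}")      # -> 88.7%
print(f"Implied TC average: {tc_average_estimate:.1%}")   # -> 62.0%
```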

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments

The amendments to claims 1, 7, 9, and 10 are accepted and entered. Claims 1-10 are pending in this application.

Response to Arguments

Applicant's arguments filed 01/08/2026 have been fully considered but they are not persuasive. In the Remarks, applicant states “Jalata discloses inputting position coordinates to a graph convolutional network. The reference does not disclose "inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network that performs graph convolution regarding adjacent joints represented in a skeleton form as a graph structure" as recited in the claim”. However, the examiner maintains that Jalata teaches this limitation in Section 4.4 Data Normalization, wherein Jalata teaches “to mitigate the effects of these noisy data, we normalized the image-plane coordinates of knees, ankles, hips, big toes, projected angles of the ankle and knee flexion, the distance between the first toe and ankle, and the distance between the left ankle and right ankle [13]”, and wherein the normalization process occurs before the graph is input into the graph convolutional network as shown in Section 3.3. The examiner agrees that Jalata teaches inputting position coordinates to a graph convolutional network; however, these coordinates are normalized as shown above and are broadly interpreted as equivalent to the claimed posture data.

Additionally, applicant argues “Dwibedi Section 1 states "We present a self-supervised representation learning technique called temporal cycle consistency (TCC) learning." Dwibedi discloses general video synchronization for representation learning. The reference does not disclose "synchronize the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path to align timing of the synchronization target motion included in the synchronization target moving image data with timing of the synchronization target motion included in the reference moving image data" as recited in the claim”. The examiner disagrees with this assertion. Dwibedi teaches temporal alignment between video feature embeddings wherein the normalized distance between the embeddings is optimized in order to find the optimal path/alignment in Figure 3 and Section 3 Cycle Consistent Representation Learning. Further, Switonski et al. (“Dynamic Time Warping In Gait Classification of Motion Capture Data”) has also been included in this rejection to teach the Dynamic Time Warping algorithm, wherein Switonski et al. additionally teaches “aligning timings of frames connected by the optimal path to align timing of the synchronization target motion”.

Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. As a result, Switonski et al. 
(“Dynamic Time Warping In Gait Classification of Motion Capture Data”) has been included to teach “normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping” (emphasis added). See the 103 rejection of claim 1 below regarding this matter. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1-5 and 7-10 are rejected under 35 U.S.C. 103 as being unpatentable over Jalata et al. (“Movement Analysis for Neurological and Musculoskeletal Disorders Using Graph Convolutional Neural Network”, 2021), hereinafter Jalata in view of Dwibedi et al. (“Temporal Cycle-Consistency Learning”, 2019), hereinafter Dwibedi and Switonski et al. (“Dynamic Time Warping In Gait Classification of Motion Capture Data”), hereinafter Switonski. Regarding claim 1, Jalata teaches a data conversion device comprising: a memory storing instructions (Jalata teaches a machine with a Linux cluster CPU and GPU in Section 4.3. Experiment Settings which inherently has a memory storing instructions and a processor); and a processor connected to the memory and configured to execute the instructions (Jalata teaches a machine with a Linux cluster CPU and GPU in Section 4.3. 
Experiment Settings which inherently has a memory storing instructions and a processor configured to execute instructions) to: normalize posture data estimated in each frame constituting moving image data including a synchronization target motion (Jalata, Fig 1 shows keypoints of the body joints are aligned both on spatial and temporal dimensions described as a skeleton sequence with N joints and T frames featuring both intra-body and inter-frame connection which is interpreted as the synchronization component of the target motion; 3.3 Graph CNN; “synchronization target motion” is described broadly in the applicant’s specification (see Fig 4 and page 9 of the specification); the target motion here is the gait as noted in Section 3.2 Data Preprocessing) into an angular representation (Jalata teaches an angular representation of the skeleton in the video frames in section 3.3 Graph Convolutional Neural Network wherein the angular representation uses the time-series data with gait metrics predicted (F(X; theta f) the theta is the critical aspect of the angular representation; (see also Figure 1) wherein “to mitigate the effects of these noisy data, we normalized the image-plane coordinates of knees, ankles, hips, big toes, projected angles of the ankle and knee flexion, the distance between the first toe and ankle, and the distance between the left ankle and right ankle [13]” in Section 4.4 Data Normalization; see also that the normalization process occurs before the graph is input into the graph convolution network as shown in section 3.3); calculate a feature amount in an embedded space (Jalata teaches a feature amount through the teaching of the time-series data of 3D joint kinematics resulting in motion (gait metrics) in Section 3.3; Figure 2 (a) shows the embedded space representation) by inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network that performs graph convolution regarding adjacent joints represented in a skeleton form as a graph structure (Jalata teaches “the keypoints denoted as blue dots in first frame and green dots in the following frames are used as input to the proposed graph convolutional neural network” in Section 3.3 Graph Convolutional Neural Network (Figure 1); here, the graph shown in Figure 1 is equivalent to the angular representation; see also Figure 3 wherein the input is a skeleton in the form of a graph structure). Jalata fails to teach normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculate a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping; calculate an optimal path for each frame based on the calculated distance; synchronize the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path to align timing of the synchronization target motion included in the synchronization target moving image data with the timing of the synchronization target motion included in the reference moving image data; and output the synchronization target moving image data synchronized with the reference moving image data. 
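For readers unfamiliar with the claimed "angular representation by Euler angles formed by connection lines connecting joints," the following minimal sketch shows one way per-frame joint coordinates can be normalized into angles of the connection lines (bones). It is illustrative only: the five-joint skeleton, the joint indices, and the simplified two-angle (azimuth/elevation) convention are assumptions made for this sketch, not the applicant's or Jalata's actual Euler-angle convention.

```python
# Illustrative sketch: turning per-frame joint coordinates into an angular
# (joint-angle) representation. Skeleton layout and angle convention are assumed.
import numpy as np

# Hypothetical 5-joint layout: 0=hip, 1=knee, 2=ankle, 3=shoulder, 4=elbow
PARENT = {1: 0, 2: 1, 3: 0, 4: 3}   # child joint index -> parent joint index

def bone_angles(frame_xyz: np.ndarray) -> np.ndarray:
    """frame_xyz: (num_joints, 3) joint positions for one frame.
    Returns azimuth/elevation angles of each connection line (bone),
    which removes limb-length scale and body translation."""
    angles = []
    for child, parent in PARENT.items():
        v = frame_xyz[child] - frame_xyz[parent]          # connection line between adjacent joints
        v = v / (np.linalg.norm(v) + 1e-8)                # unit vector: drops limb-length scale
        azimuth = np.arctan2(v[1], v[0])                  # rotation about the z axis
        elevation = np.arcsin(np.clip(v[2], -1.0, 1.0))   # rotation out of the xy-plane
        angles.extend([azimuth, elevation])
    return np.asarray(angles)

def normalize_sequence(seq_xyz: np.ndarray) -> np.ndarray:
    """seq_xyz: (num_frames, num_joints, 3) -> (num_frames, 2 * num_bones)."""
    return np.stack([bone_angles(f) for f in seq_xyz])

# Example: 10 frames of synthetic posture data for 5 joints
posture = np.random.rand(10, 5, 3)
angular = normalize_sequence(posture)   # angular representation, one row per frame
```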
**Note: Dwibedi additionally teaches calculate a feature amount in an embedded space by inputting the posture data to an encoder including a neural network (Dwibedi teaches “We show two example video sequences encoded in an example embedding space” in Figure 2; Applicant teaches that the feature amount may be an embedding in an embedded space in the specification; Dwibedi further teaches the calculation of the embedding (feature amount) through a neural network in Section 3 Cycle Consistent Representation Learning; it should be noted that the teaching of the neural network specifically being a graph convolutional network is taught in the mapping above). Dwibedi teaches calculate a distance between a feature amount calculated in each frame constituting a reference moving image data (Dwibedi teaches aligning two video images of similar motions in Section 3 Cycle Consistent Representation Learning; here, under the broadest reasonable interpretation of the reference moving image data as claimed in the claim language, it can be interpreted that video U is the reference video; see also applicant’s description of the reference moving image data in page 14 of the applicant’s specification) and a feature amount calculated in each frame constituting synchronization target moving image data (Dwibedi teaches calculating the distance between two embeddings (see Figure 2 & Figure 1) wherein each of the embeddings represents a video to be aligned in Section 3 Cycle Consistent Representation Learning; the two videos here are interpreted as the first and second moving image data; it should be noted that the teaching of the two instances of moving data specifically being reference moving image data and the synchronization target moving image data are included in the mapping below); calculate an optimal path for each frame based on the calculated distance (Dwibedi teaches “maximizing the number of points that can be mapped one-to-one between two sequences by using the minimum distance in the learned embedding space. We can achieve such an objective by maximizing the number of cycle-consistent frames between two sequences (see Figure 2)” in Section 3 Cycle Consistent Representation Learning; Dwibedi then further teaches calculating the embedding for each frame in each video sequence (U being the embeddings for each frame in video 1 and V being the embeddings for each frame in video 2) and using the distance between each frame in U with its nearest neighbor in V to determine the optimal classification (which is interpreted as equivalent to the path introduced in the claim language), this process (referred to as cycle-back classification is then repeated using the inverse (wherein U and V are switched) in Section 3 Cycle-back Classification; this classification process is interpreted as equivalent to the process in which the optimal path is calculated (see Figure 1); see also Figure 3); synchronize the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path to align timing of the synchronization target motion included in the synchronization target moving image data with the timing of the synchronization target motion included in the reference moving image data (Dwibedi teaches a temporal alignment in Figure 1 in which the frames are aligned as a result of the nearest neighbor frames determined in the embedding space (i.e. through the process described in 3.2 Cycle-back Classification. 
Figure 3 shows this temporal alignment wherein U and V are temporally aligned. As noted above, U is equivalent to the reference moving image data which includes the synchronization target motion; see also Switonski’s teaching of DTW below); and output the synchronization target moving image data synchronized with the reference moving image data (Dwibedi teaches aligning a pair of videos using nearest neighbor matching and evaluating their accuracy in section 4.1 Evaluation). Jalata and Dwibedi are both considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata to incorporate the teachings of Dwibedi and include to “calculate a distance between a feature amount calculated in each frame constituting a reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data; calculate an optimal path for each frame based on the calculated distance; synchronize the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path; and output the synchronization target moving image data synchronized with the reference moving image data”. The motivation for doing so would have been to “learn[] representations by aligning video sequences of the same action”… “for the tasks of action phase classification and continuous progress tracking of an action”, as suggested by Dwibedi in Section 1 Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata with Dwibedi to obtain the invention specified in the above claim limitations. While Dwibedi teaches calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data and Jalata teaches normalizing posture data into an angular representation, Jalata and Dwibedi fail to teach normalize posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculate a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (emphasis added). However, Switonski teaches normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame (Switonski states “on the basis of gathered data the 3D coordinates of the markers are reconstructed. They are further transformed into the kinematic chain representation with specified skeleton model. The joint rotations can be coded by Euler angles or unit quaternions” in Section II Motion Capture, wherein coding by Euler angles inherently involves normalization. See also FIG. 
2 (see also Jalata’s teaching of normalization above)); calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (Switonski states “Dynamic Time Warping synchronizes two motions. It uses a cost matrix which contains the similarities between every pair of poses of compared motions. The synchronization is determined by the monotonic path connecting starting and ending points of the cost matrix with the lowest accumulated cost” in Section III Dynamic Time Warping. Section V specifically discusses applying the DTW algorithm to rotations coded by Euler angles and quaternions wherein this process occurs in the context of gait alignment). Jalata, Dwibedi, and Switonski are all considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata (as modified by Dwibedi) to incorporate the teachings of Switonski and include “normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping”. The motivation for doing so would have been to accurately determine “gait identification by Dynamic Time Warping approach”, as suggested by Switonski in Section I Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Dwibedi with Switonski to obtain the invention specified in claim 1. Regarding claim 2, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 1, wherein the encoder convolves the posture data normalized into the angular representation by graph convolution and outputs an embedding in the embedded space as a feature amount (The combination of Jalata and Dwibedi teaches the above subject matter. Dwibedi teaches an encoder that convolves the posture data and outputs an embedding in the embedded space as a feature amount. See Figure 2; Dwibedi further teaches the calculation of the embedding (feature amount) through a neural network in Section 3 Cycle Consistent Representation Learning; Jalata teaches the specific process of convolving the posture data normalized into the angular representation by graph convolution in Section 3.3 Graph Convolutional Neural Network (Figure 1); here, the graph shown in Figure 1 is equivalent to the angular representation; see also Figure 3). 
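Claim 2's encoder, which "convolves the posture data normalized into the angular representation by graph convolution and outputs an embedding," can be pictured with a single graph-convolution layer over a skeleton adjacency matrix. The sketch below is a generic illustration of that technique under assumed sizes and a toy skeleton; it is not Jalata's architecture or the claimed encoder.

```python
# Minimal sketch of one skeleton graph-convolution layer: each joint's features
# are mixed with its adjacent joints' features before a learned projection.
import numpy as np

def gcn_layer(X: np.ndarray, A: np.ndarray, W: np.ndarray) -> np.ndarray:
    """X: (num_joints, feat_in) per-frame joint features (e.g. joint angles),
    A: (num_joints, num_joints) skeleton adjacency (1 where joints are connected),
    W: (feat_in, feat_out) learned weights. Returns (num_joints, feat_out)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-connections
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))         # symmetric degree normalization
    H = D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W    # aggregate neighbors, then project
    return np.maximum(H, 0.0)                      # ReLU

# Toy 5-joint skeleton: edges 0-1, 1-2, 0-3, 3-4 (same chain as the earlier sketch)
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                        # per-joint angular features for one frame
W = rng.normal(size=(8, 16))                       # illustrative learned weights
node_feats = gcn_layer(X, A, W)
frame_embedding = node_feats.mean(axis=0)          # one feature amount (embedding) per frame
```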
Regarding claim 3, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 2, wherein the processor is configured to execute the instructions to calculate a distance between a feature amount related to a frame constituting the reference moving image data and a feature amount related to a frame constituting the synchronization target moving image data in a brute-force manner (Dwibedi teaches a cycle-back classification method that calculates the embedding (feature amount) of a first video and an embedding (feature amount) of a second video and calculates a distance (squared Euclidean distance) between the two embeddings by computing the soft nearest neighbor, in which the distance between each embedding of the first video and each embedding in the second video is computed, in Section 3.2 Cycle-back Classification, and Figure 3), and calculate the optimal path for each frame based on the calculated distance (Dwibedi, see Figures 1 and 3 in which the temporal alignment represents the optimal path, and the temporal cycle consistency acts as the process used to develop the optimal alignment for each frame). Similar motivations as applied to claim 1 can be applied here. Regarding claim 4, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 1, wherein the processor is configured to execute the instructions to acquire the synchronization target moving image data and the reference moving image data (Jalata teaches a machine with a Linux cluster CPU and GPU in Section 4.3. Experiment Settings which acquires the video segments to be analyzed as shown in Section 3; Jalata and Dwibedi further teach the synchronization target moving image data and the reference moving image data, respectively, as shown in claim 1); and estimate the posture data in each frame constituting each of the synchronization target moving image data (Jalata teaches that “the skeleton of the body is represented as an undirected graph 𝐺={𝑉,𝐸} on a skeleton sequence with 𝑁 joints and 𝑇 frames featuring both intra-body and inter-frame connection” in Section 3.3 Graph Convolutional Network; here the graph representation as shown in Figure 1 is interpreted as equivalent to the posture data and the temporal sequence represents each frame of the input video segment (synchronization target moving image data)) and the reference moving image data (Jalata teaches the posture data for each frame in the above citation, while Dwibedi teaches the reference moving image data in claim 1 and further teaches estimating posture data in each frame of the input reference moving image data in Figure 2). Similar motivations as applied to claim 1 can be applied here. Regarding claim 5, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 4, wherein the processor is configured to execute the instructions to acquire a plurality of pieces of the synchronization target moving image data (Jalata teaches a plurality of pieces of the target moving image data in Figure 1, wherein the video segments are equivalent to the plurality of segments; see Section 3.3. Graph Convolutional Neural Network and Section 3.1. 
Problem Formulation (see also claim 1 for a detailed description of the synchronization target moving image data); see also applicant’s specification which describes the pieces of motion data as “posture data” on page 7), estimate the posture data in each of a plurality of frames constituting the plurality of pieces of synchronization target moving image data (Jalata teaches estimating the posture data in Figure 1 on which all of the frames of the video segments are analyzed), normalize the posture data in each of the plurality of frames constituting the plurality of pieces of synchronization target moving image data into an angular representation (Jalata teaches an angular representation of the skeleton in the video frames in section 3.3 Graph Convolutional Neural Network (see also Figure 1) wherein “to mitigate the effects of these noisy data, we normalized the image-plane coordinates of knees, ankles, hips, big toes, projected angles of the ankle and knee flexion, the distance between the first toe and ankle, and the distance between the left ankle and right ankle [13]” in Section 4.4 Data Normalization; see also that the normalization process occurs before the graph is input into the graph convolution network as shown in section 3.3; the synchronization target motion here is the gait as noted in Section 3.2 Data Preprocessing; see claim 1 for a more detailed explanation of the teaching of the synchronization target motion), input the posture data normalized into the angular representation to the encoder to calculate a feature amount (While Jalata teaches the posture data normalized into the angular representation as shown in the previous citation, Dwibedi teaches inputting the feature data into the encoder to calculate an embedding (feature amount) as shown in Figure 2 wherein the video sequence is encoded within an embedding space), calculate a distance between feature amounts calculated in each of the plurality of frames constituting the plurality of pieces of synchronization target moving image data (Dwibedi teaches calculating a distance between feature amounts in each of the plurality of frames between two input videos in claim 1; here, the input videos can be interpreted as the pieces of synchronization target moving image data; see Figures 1 and 2), calculate the optimal path for each frame based on the calculated distance, and synchronize the plurality of pieces of synchronization target moving image data with each other by aligning timings of frames connected by the optimal path (Dwibedi teaches the optimal path as shown in Figure 1 in which a temporal alignment occurs based on the cycle-back classification process that calculates the distance between the embeddings of the two input videos wherein the input videos can be interpreted as the pieces of synchronization target moving image data; see Figure 2). Regarding claim 7, Jalata, Dwibedi, and Switonski teach a moving image conversion system (Dwibedi, see Figure 1) comprising: the data conversion device according to claim 1 (see claim 1 and below citation); and a learning device including a memory storing instructions (Jalata teaches a device storing instructions in section 4.3, wherein the invention is implemented in two separate phases: training and testing, on a machine with a CPU. Here, the training process that occurs on the CPU is interpreted as equivalent to the learning device, and the testing/implementation process is interpreted as equivalent to the data conversion device. 
Since these are two separate processes, it is broadly interpreted as equivalent to a learning device and a data conversion device); and a processor connected to the memory and configured to execute the instructions to normalize posture data estimated in each frame constituting learning target moving image data including a synchronization target motion into an angular representation by calculating Euler angles formed by connection lines connecting joints of a person extracted from each frame (Jalata teaches a processor connected to the memory and configured to execute the instructions and normalizing posture data estimated in each frame constituting learning target moving image data including a synchronization target motion into an angular representation as shown in claim 1. Switonski teaches the angular representation being coded by Euler angles wherein the angular representation is formed by connection lines connecting joints of a person as shown in Figure 2), calculate a feature amount in an embedded space by inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network that performs graph convolution regarding adjacent joints represented in a skeleton form as a graph structure (Dwibedi teaches the calculated feature amount in an embedded space while Jalata teaches inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network as shown above in claim 1); calculate a loss in accordance with the feature amount calculated by the encoder (Dwibedi teaches that the embedding (feature amount) is calculated by the encoder in Figure 3 wherein the cross-entropy loss is calculated directly on the embeddings as shown in Section 3.2. Cycle-back Classification, equation (2)); and train the encoder based on a gradient of the calculated loss (Dwibedi teaches that the “self-supervised representation is learned by minimizing the cycle-consistency loss for all the pair of sequences in the training set” as shown in 3.4. Implementation details; here, the gradient is represented by the derivative of the cross entropy loss, as both of the versions of cycle-consistency loss are differentiable as taught in Section 3. Cycle Consistent Representation Learning; See also Section 3.3. Cycle-back Regression which recites: “all these formulations are differentiable and can conveniently be optimized with conventional back-propagation”). It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata and Switonski to incorporate the teachings of Dwibedi and include “calculate a feature amount in an embedded space”, “calculate a loss in accordance with the feature amount calculated by the encoder” and “train the encoder based on a gradient of the calculated loss” in a learning device. The motivation for doing so would have been “in order to incorporate temporal proximity” within the loss, and “optimize cycle consistency losses”, as suggested by Dwibedi in Section 3.3 Cycle-back Regression and Section 3.4 Implementation details, respectively. Here, Dwibedi’s teaching of the loss in the above training process and a separate testing and training phase (see Section 4 Datasets and Evaluation) can be combined with Jalata’s teaching of a separate learning and data conversion process to teach the above limitations. 
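The loss and gradient-based training mapped to Dwibedi above can be illustrated with a simplified reimplementation of the cycle-back classification loss described in Dwibedi's Section 3.2 (soft nearest neighbor followed by cross-entropy). The shapes and random data below are assumptions; this is a sketch of the paper's described technique, not its released code.

```python
# Hedged sketch of cycle-back classification: a frame embedding from one video
# is matched to its soft nearest neighbor in the other video, cycled back, and
# penalized with cross-entropy if it does not land on the frame it started from.
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def cycle_back_loss(U: np.ndarray, V: np.ndarray, i: int) -> float:
    """U: (N, d) frame embeddings of one video, V: (M, d) of the other,
    i: index of the starting frame in U. Returns the cross-entropy loss."""
    alpha = softmax(-np.sum((V - U[i]) ** 2, axis=1))    # soft nearest neighbor weights in V
    v_tilde = alpha @ V                                  # soft nearest neighbor of u_i
    beta = softmax(-np.sum((U - v_tilde) ** 2, axis=1))  # cycle back: distribution over U
    return float(-np.log(beta[i] + 1e-12))               # should peak at the original index i

rng = np.random.default_rng(0)
U = rng.normal(size=(12, 16))   # e.g. reference video embeddings
V = rng.normal(size=(15, 16))   # e.g. synchronization target embeddings
loss = np.mean([cycle_back_loss(U, V, i) for i in range(len(U))])
# During training, an encoder would be updated by gradient descent on this loss.
```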
Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Switonski with Dwibedi to obtain the invention specified in the above limitations. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata (as modified by Dwibedi) to incorporate the teachings of Switonski and include “calculating Euler angles formed by connection lines connecting joints of a person extracted from each frame”. The motivation for doing so would have been to accurately determining “gait identification by Dynamic Time Warping approach”, as suggested by Switonski in Section I Introduction. Here, Switonski’s teaching of the Euler angles and the teaching of a separate testing and training phase (see Section 6.1) can be combined with Jalata’s teaching of a separate learning and data conversion process to teach the above limitations. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Dwibedi with Switonski to obtain the invention specified in claim 7. Regarding claim 8, Jalata, Dwibedi, and Switonski teach the moving image conversion system according to claim 7, wherein the processor of the learning device is configured to execute the instructions to cause an encoder to learn learning target moving image data including a learning target motion (Jalata teaches a learning process which learns target moving image data including a gait parameter (learning target motion in Section 3.1. Problem Formulation) (Dwibedi teaches an encoder to learn moving image data in Figure 2 wherein the “self-supervised representation is learned by minimizing the cycle-consistency loss for all the pair of sequences in the training set” as noted in Section 3.4. Implementation details), and update the encoder in accordance with a learning result (Dwibedi teaches that “given a sequence pair, their frames are embedded using the encoder network and we optimize cycle consistency losses for randomly selected frames within each sequence until convergence” in Section 3.4. Implementation Details), and the processor of the data conversion device is configured to execute the instructions to acquire synchronization target moving image data and reference moving image data including a synchronization target motion equivalent to the learning target motion by using the encoder updated by the learning device (While Jalata and Dwibedi teach the learning target motion as shown in the citation above, Dwibedi additionally teaches updating the encoder with the learning result (shown above) and acquiring motion data to be synchronized within two similar video sequences in Section 3. Cycle Consistent Representation Learning), and synchronize the synchronization target moving image data with the reference moving image data by using the encoder (Dwibedi teaches synchronizing two similar motion videos by using an encoder network in Figure 3 and Figure 1, wherein, under BRI, and as shown in claim 1, the moving image data as taught by Dwibedi can be interpreted as synchronization target moving image data and reference moving image data). 
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata and Switonski to incorporate the teachings of Dwibedi and include “an encoder to learn learning target moving image data including a learning target motion”, “update the encoder in accordance with a learning result”, “acquiring motion data to be synchronized within two similar video sequences”, and “synchronize the synchronization target moving image data with the reference moving image data by using the encoder” in a learning device. The motivation for doing so would have been to use learned alignments to “transfer the pace of a video to other videos of the same action” and “TCC can be used for learning features from scratch and brings about significant performance boosts over plain supervised learning when there is limited labeled data”, as suggested by Dwibedi in Section 6 Applications (last paragraph) and Section 5.3 Action Phase Classification, respectively. Here, Dwibedi’s teaching of the synchronization process by using an encoder in the above training process and a separate testing and training phase (see Section 4 Datasets and Evaluation) can be combined with Jalata’s teaching of a separate learning and data conversion process to teach the above limitations. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Switonski with Dwibedi to obtain the invention specified in claim 8. Regarding claim 9, Jalata teaches a data conversion method executed by a computer (Jalata teaches a machine with a Linux cluster CPU and GPU in Section 4.3. Experiment Settings), the method comprising: normalizing posture data estimated in each frame constituting moving image data including a synchronization target motion (Jalata, Fig 1 shows keypoints of the body joints are aligned both on spatial and temporal dimensions described as a skeleton sequence with N joints and T frames featuring both intra-body and inter-frame connection which is interpreted as the synchronization component of the target motion; 3.3 Graph CNN; “synchronization target motion” is described broadly in the applicant’s specification (see Fig 4 and page 9 of the specification); the target motion here is the gait as noted in Section 3.2 Data Preprocessing) into an angular representation (Jalata teaches an angular representation of the skeleton in the video frames in section 3.3 Graph Convolutional Neural Network wherein the angular representation uses the time-series data with gait metrics predicted (F(X; theta f) the theta is the critical aspect of the angular representation; (see also Figure 1) wherein “to mitigate the effects of these noisy data, we normalized the image-plane coordinates of knees, ankles, hips, big toes, projected angles of the ankle and knee flexion, the distance between the first toe and ankle, and the distance between the left ankle and right ankle [13]” in Section 4.4 Data Normalization; see also that the normalization process occurs before the graph is input into the graph convolution network as shown in section 3.3), inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network that performs graph convolution regarding adjacent joints represented in a skeleton form as a graph structure to calculate a feature amount in an embedded space (Jalata teaches a feature amount through the teaching of the time-series data of 3D 
joint kinematics resulting in motion (gait metrics) in Section 3.3; Figure 2 (a) shows the embedded space representation; Jalata teaches “the keypoints denoted as blue dots in first frame and green dots in the following frames are used as input to the proposed graph convolutional neural network” in Section 3.3 Graph Convolutional Neural Network (Figure 1); here, the graph shown in Figure 1 is equivalent to the angular representation; see also Figure 3 wherein the input is a skeleton in the form of a graph structure). Jalata fails to teach calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data; calculating an optimal path for each frame based on the calculated distance; synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path; and outputting the synchronization target moving image data synchronized with the reference moving image data. **Note: Dwibedi additionally teaches inputting the posture data into network to calculate a feature amount in an embedded space (Dwibedi teaches “We show two example video sequences encoded in an example embedding space” in Figure 2; Applicant teaches that the feature amount may be an embedding in an embedded space in the specification; Dwibedi further teaches the calculation of the embedding (feature amount) through a neural network in Section 3 Cycle Consistent Representation Learning; it should be noted that the teaching of the neural network specifically being a graph convolutional network is taught in the mapping above). Dwibedi teaches calculating a distance between a feature amount calculated in each frame constituting a reference moving image data (Dwibedi teaches aligning two video images of similar motions in Section 3 Cycle Consistent Representation Learning; here, under the broadest reasonable interpretation of the reference moving image data as claimed in the claim language, it can be interpreted that video U is the reference video; see also applicant’s description of the reference moving image data in page 14 of the applicant’s specification) and a feature amount calculated in each frame constituting synchronization target moving image data (Dwibedi teaches calculating the distance between two embeddings (see Figure 2 & Figure 1) wherein each of the embeddings represents a video to be aligned in Section 3 Cycle Consistent Representation Learning; the two videos here are interpreted as the first and second moving image data; it should be noted that the teaching of the two instances of moving data specifically being reference moving image data and the synchronization target moving image data are included in the mapping below); calculating an optimal path for each frame based on the calculated distance (Dwibedi teaches “maximizing the number of points that can be mapped one-to-one between two sequences by using the minimum distance in the learned embedding space. 
We can achieve such an objective by maximizing the number of cycle-consistent frames between two sequences (see Figure 2)” in Section 3 Cycle Consistent Representation Learning; Dwibedi then further teaches calculating the embedding for each frame in each video sequence (U being the embeddings for each frame in video 1 and V being the embeddings for each frame in video 2) and using the distance between each frame in U with its nearest neighbor in V to determine the optimal classification (which is interpreted as equivalent to the path introduced in the claim language), this process (referred to as cycle-back classification is then repeated using the inverse (wherein U and V are switched) in Section 3 Cycle-back Classification; this classification process is interpreted as equivalent to the process in which the optimal path is calculated (see Figure 1); see also Figure 3); synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path to align timing of the synchronization target motion included in the synchronization target moving image data with the timing of the synchronization target motion included in the reference moving image data (Dwibedi teaches a temporal alignment in Figure 1 in which the frames are aligned as a result of the nearest neighbor frames determined in the embedding space (i.e. through the process described in 3.2 Cycle-back Classification. Figure 3 shows this temporal alignment wherein U and V are temporally aligned. As noted above, U is equivalent to the reference moving image data which includes the synchronization target motion; see also Switonski’s teaching of DTW below); and outputting the synchronization target moving image data synchronized with the reference moving image data (Dwibedi teaches aligning a pair of videos using nearest neighbor matching and evaluating their accuracy in section 4.1 Evaluation). Jalata and Dwibedi are both considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata to incorporate the teachings of Dwibedi and include “calculating a distance between a feature amount calculated in each frame constituting a reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data; calculating an optimal path for each frame based on the calculated distance; synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path; and outputting the synchronization target moving image data synchronized with the reference moving image data”. The motivation for doing so would have been to “learn[] representations by aligning video sequences of the same action”… “for the tasks of action phase classification and continuous progress tracking of an action”, as suggested by Dwibedi in Section 1 Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata with Dwibedi to obtain the invention specified in the above claim limitations. 
Jalata and Dwibedi fail to teach normalize posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculate a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (emphasis added). However, Switonski teaches normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame (Switonski states “on the basis of gathered data the 3D coordinates of the markers are reconstructed. They are further transformed into the kinematic chain representation with specified skeleton model. The joint rotations can be coded by Euler angles or unit quaternions” in Section II Motion Capture, wherein coding by Euler angles inherently involves normalization. See also FIG. 2 (see also Jalata’s teaching of normalization above)); calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (Switonski states “Dynamic Time Warping synchronizes two motions. It uses a cost matrix which contains the similarities between every pair of poses of compared motions. The synchronization is determined by the monotonic path connecting starting and ending points of the cost matrix with the lowest accumulated cost” in Section III Dynamic Time Warping. Section V specifically discusses applying the DTW algorithm to rotations coded by Euler angles and quaternions wherein this process occurs in the context of gait alignment). Jalata, Dwibedi, and Switonski are all considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata (as modified by Dwibedi) to incorporate the teachings of Switonski and include “normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping”. The motivation for doing so would have been to accurately determining “gait identification by Dynamic Time Warping approach”, as suggested by Switonski in Section I Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Dwibedi with Switonski to obtain the invention specified in claim 9. Regarding claim 10, Jalata teaches a non-transitory recording medium recording a program for causing a computer (Jalata teaches a machine with a Linux cluster CPU and GPU in Section 4.3. 
Experiment Settings, which inherently includes non-transitory media) to execute: normalizing posture data estimated in each frame constituting moving image data including a synchronization target motion (Jalata, Fig 1 shows keypoints of the body joints are aligned both on spatial and temporal dimensions described as a skeleton sequence with N joints and T frames featuring both intra-body and inter-frame connection which is interpreted as the synchronization component of the target motion; 3.3 Graph CNN; “synchronization target motion” is described broadly in the applicant’s specification (see Fig 4 and page 9 of the specification); the target motion here is the gait as noted in Section 3.2 Data Preprocessing) into an angular representation (Jalata teaches an angular representation of the skeleton in the video frames in section 3.3 Graph Convolutional Neural Network wherein the angular representation uses the time-series data with gait metrics predicted (F(X; theta f) the theta is the critical aspect of the angular representation; (see also Figure 1) wherein “to mitigate the effects of these noisy data, we normalized the image-plane coordinates of knees, ankles, hips, big toes, projected angles of the ankle and knee flexion, the distance between the first toe and ankle, and the distance between the left ankle and right ankle [13]” in Section 4.4 Data Normalization; see also that the normalization process occurs before the graph is input into the graph convolution network as shown in section 3.3), inputting the posture data normalized into the angular representation to an encoder including a graph convolutional network that performs graph convolution regarding adjacent joints represented in a skeleton form as a graph structure to calculate a feature amount in an embedded space (Jalata teaches a feature amount through the teaching of the time-series data of 3D joint kinematics resulting in motion (gait metrics) in Section 3.3; Figure 2 (a) shows the embedded space representation; Jalata teaches “the keypoints denoted as blue dots in first frame and green dots in the following frames are used as input to the proposed graph convolutional neural network” in Section 3.3 Graph Convolutional Neural Network (Figure 1); here, the graph shown in Figure 1 is equivalent to the angular representation; see also Figure 3 wherein the input is a skeleton in the form of a graph structure). Jalata fails to teach calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data; calculating an optimal path for each frame based on the calculated distance; synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path; and outputting the synchronization target moving image data synchronized with the reference moving image data. 
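The "distance ... using dynamic time warping," "optimal path," and "aligning timings of frames connected by the optimal path" limitations discussed throughout this action can be pictured with the sketch below. It follows Switonski's quoted description (a cost matrix over every pair of per-frame feature amounts and the monotonic lowest-accumulated-cost path from the first to the last frame pair) and then retimes the target frames along that path. The retiming step is one simple illustrative choice under assumed inputs, not a characterization of the claimed device's output.

```python
# Illustrative dynamic time warping sketch: cost matrix, accumulated-cost table,
# monotonic lowest-cost path, and a simple frame retiming along that path.
import numpy as np

def dtw_path(ref_feats: np.ndarray, tgt_feats: np.ndarray):
    """ref_feats: (N, d), tgt_feats: (M, d) per-frame feature amounts.
    Returns the optimal path as a list of (ref_frame, tgt_frame) pairs."""
    N, M = len(ref_feats), len(tgt_feats)
    cost = np.linalg.norm(ref_feats[:, None, :] - tgt_feats[None, :, :], axis=-1)

    acc = np.full((N, M), np.inf)                 # accumulated cost table
    acc[0, 0] = cost[0, 0]
    for i in range(N):
        for j in range(M):
            if i == j == 0:
                continue
            prev = min(acc[i - 1, j] if i else np.inf,             # advance reference only
                       acc[i, j - 1] if j else np.inf,             # advance target only
                       acc[i - 1, j - 1] if i and j else np.inf)   # advance both
            acc[i, j] = cost[i, j] + prev

    path, i, j = [(N - 1, M - 1)], N - 1, M - 1   # backtrack the monotonic path
    while (i, j) != (0, 0):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min((c for c in candidates if c[0] >= 0 and c[1] >= 0),
                   key=lambda c: acc[c])
        path.append((i, j))
    return path[::-1]

def synchronize(tgt_frames, path, num_ref_frames):
    """Retime the target video onto the reference timeline: for every reference
    frame, keep the target frame connected to it by the optimal path."""
    mapping = {}
    for ref_idx, tgt_idx in path:
        mapping.setdefault(ref_idx, tgt_idx)      # first match per reference frame
    return [tgt_frames[mapping[r]] for r in range(num_ref_frames)]

rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 16))                   # reference per-frame feature amounts
tgt = rng.normal(size=(25, 16))                   # synchronization target feature amounts
path = dtw_path(ref, tgt)
synced = synchronize(list(range(25)), path, num_ref_frames=20)  # retimed target frame indices
```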
**Note: Dwibedi additionally teaches inputting the posture data into network to calculate a feature amount in an embedded space (Dwibedi teaches “We show two example video sequences encoded in an example embedding space” in Figure 2; Applicant teaches that the feature amount may be an embedding in an embedded space in the specification; Dwibedi further teaches the calculation of the embedding (feature amount) through a neural network in Section 3 Cycle Consistent Representation Learning; it should be noted that the teaching of the neural network specifically being a graph convolutional network is taught in the mapping above). Dwibedi teaches calculating a distance between a feature amount calculated in each frame constituting a reference moving image data (Dwibedi teaches aligning two video images of similar motions in Section 3 Cycle Consistent Representation Learning; here, under the broadest reasonable interpretation of the reference moving image data as claimed in the claim language, it can be interpreted that video U is the reference video; see also applicant’s description of the reference moving image data in page 14 of the applicant’s specification) and a feature amount calculated in each frame constituting synchronization target moving image data (Dwibedi teaches calculating the distance between two embeddings (see Figure 2 & Figure 1) wherein each of the embeddings represents a video to be aligned in Section 3 Cycle Consistent Representation Learning; the two videos here are interpreted as the first and second moving image data; it should be noted that the teaching of the two instances of moving data specifically being reference moving image data and the synchronization target moving image data are included in the mapping below); calculating an optimal path for each frame based on the calculated distance (Dwibedi teaches “maximizing the number of points that can be mapped one-to-one between two sequences by using the minimum distance in the learned embedding space. We can achieve such an objective by maximizing the number of cycle-consistent frames between two sequences (see Figure 2)” in Section 3 Cycle Consistent Representation Learning; Dwibedi then further teaches calculating the embedding for each frame in each video sequence (U being the embeddings for each frame in video 1 and V being the embeddings for each frame in video 2) and using the distance between each frame in U with its nearest neighbor in V to determine the optimal classification (which is interpreted as equivalent to the path introduced in the claim language), this process (referred to as cycle-back classification is then repeated using the inverse (wherein U and V are switched) in Section 3 Cycle-back Classification; this classification process is interpreted as equivalent to the process in which the optimal path is calculated (see Figure 1); see also Figure 3); synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path to align timing of the synchronization target motion included in the synchronization target moving image data with the timing of the synchronization target motion included in the reference moving image data (Dwibedi teaches a temporal alignment in Figure 1 in which the frames are aligned as a result of the nearest neighbor frames determined in the embedding space (i.e. through the process described in 3.2 Cycle-back Classification. 
Figure 3 shows this temporal alignment wherein U and V are temporally aligned. As noted above, U is equivalent to the reference moving image data which includes the synchronization target motion; see also Switonski’s teaching of DTW below); and outputting the synchronization target moving image data synchronized with the reference moving image data (Dwibedi teaches aligning a pair of videos using nearest neighbor matching and evaluating their accuracy in section 4.1 Evaluation). Jalata and Dwibedi are both considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata to incorporate the teachings of Dwibedi and include “calculating a distance between a feature amount calculated in each frame constituting a reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data; calculating an optimal path for each frame based on the calculated distance; synchronizing the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path; and outputting the synchronization target moving image data synchronized with the reference moving image data”. The motivation for doing so would have been to “learn[] representations by aligning video sequences of the same action”… “for the tasks of action phase classification and continuous progress tracking of an action”, as suggested by Dwibedi in Section 1 Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata with Dwibedi to obtain the invention specified in the above claim limitations. While Dwibedi teaches calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data and Jalata teaches normalizing posture data into an angular representation, Jalata and Dwibedi fail to teach normalize posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculate a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (emphasis added). However, Switonski teaches normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame (Switonski states “on the basis of gathered data the 3D coordinates of the markers are reconstructed. They are further transformed into the kinematic chain representation with specified skeleton model. The joint rotations can be coded by Euler angles or unit quaternions” in Section II Motion Capture, wherein coding by Euler angles inherently involves normalization. See also FIG. 
2 (see also Jalata’s teaching of normalization above)); calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping (Switonski states “Dynamic Time Warping synchronizes two motions. It uses a cost matrix which contains the similarities between every pair of poses of compared motions. The synchronization is determined by the monotonic path connecting starting and ending points of the cost matrix with the lowest accumulated cost” in Section III Dynamic Time Warping. Section V specifically discusses applying the DTW algorithm to rotations coded by Euler angles and quaternions wherein this process occurs in the context of gait alignment). Jalata, Dwibedi, and Switonski are all considered to be analogous to the claimed invention because they are in the same field of analyzing a video sequence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Jalata (as modified by Dwibedi) to incorporate the teachings of Switonski and include “normalizing posture data into an angular representation by Euler angles formed by connection lines connecting joints of a person extracted from each frame; calculating a distance between a feature amount calculated in each frame constituting reference moving image data and a feature amount calculated in each frame constituting synchronization target moving image data using dynamic time warping”. The motivation for doing so would have been to accurately determining “gait identification by Dynamic Time Warping approach”, as suggested by Switonski in Section I Introduction. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Jalata and Dwibedi with Switonski to obtain the invention specified in claim 10. Allowable Subject Matter Claim 6 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter. The best prior art of record is Jalata, Dwibedi, Switonski, Bazin et al. (U.S. Publication No. 2018/0176423 A1), hereinafter Bazin, and Plizzari et al. (“Skeleton-based Action Recognition via Spatial and Temporal Transformer Networks”, 2021), hereinafter Plizzari. Prior art applied alone or in combination with fails to anticipate or render obvious claim 6. Claim 6 Regarding claim 6, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 5, wherein the processor is configured to execute the instructions to store a used for synchronization of the synchronization target moving image data (Dwibedi teaches a soft nearest neighbor distribution which converts each embedding in U into a weighted combination of embeddings in V (where U and V each represent a unique video embedding of the videos to be synchronized) in Section 3.2. 
Allowable Subject Matter

Claim 6 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter. The best prior art of record is Jalata, Dwibedi, Switonski, Bazin et al. (U.S. Publication No. 2018/0176423 A1), hereinafter Bazin, and Plizzari et al. (“Skeleton-based Action Recognition via Spatial and Temporal Transformer Networks”, 2021), hereinafter Plizzari. The prior art, applied alone or in combination, fails to anticipate or render obvious claim 6.

Claim 6

Regarding claim 6, Jalata, Dwibedi, and Switonski teach the data conversion device according to claim 5, wherein the processor is configured to execute the instructions to store a conversion array used for synchronization of the synchronization target moving image data (Dwibedi teaches a soft nearest neighbor distribution which converts each embedding in U into a weighted combination of embeddings in V (where U and V each represent a unique video embedding of the videos to be synchronized) in Section 3.2 Cycle-back Classification and Figure 3; it is inferred that the distribution is stored because it is later used in subsequent functions and analyzed in an analysis step), perform inverse conversion of the synchronization target moving image data used as the reference moving image data (Dwibedi teaches the inverse conversion using the conversion array in Figure 3 and Section 3.3 Cycle-back Regression, in which the process described in the above mapping is repeated with U and V switched), calculate the optimal path for each frame based on the calculated distance (Dwibedi teaches the optimal path as shown in Figure 1, in which a temporal alignment occurs based on the cycle-back classification process that calculates the distance between the embeddings of the two videos), synchronize the synchronization target moving image data with the reference moving image data by aligning timings of frames connected by the optimal path (Dwibedi teaches the optimal path as shown in Figure 1, in which a temporal alignment occurs based on the cycle-back classification process that calculates the distance between the embeddings of the two videos), and store the conversion array used for synchronization of the synchronization target moving image data in the conversion array storage means (Dwibedi teaches a cycle-back classification process that utilizes back propagation, from which it is inferred that the soft nearest neighbor distribution shown in Figure 3 is stored at least temporarily within the encoder network so that it can be accessed again in the cycling-back process).
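As a rough, illustrative sketch of the Dwibedi mechanism mapped above (and not of the claimed conversion array), the code below forms the soft nearest neighbor of each embedding in U as a softmax-weighted combination of the embeddings in V and then matches back into U; where the two videos are well aligned, the frame recovered on the way back matches the starting frame. The squared-Euclidean similarity and the array shapes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cycle_back(U, V):
    """U: (T1, D), V: (T2, D) per-frame embeddings of the two videos.

    Each u_i is soft-matched into V (a weighted combination of V's
    embeddings) and then matched back into U; the returned indices are
    the frames of U reached after the cycle.
    """
    alpha = softmax(-np.square(U[:, None, :] - V[None, :, :]).sum(-1), axis=1)  # (T1, T2)
    v_tilde = alpha @ V                                                         # (T1, D)
    beta = softmax(-np.square(v_tilde[:, None, :] - U[None, :, :]).sum(-1), axis=1)
    return beta.argmax(axis=1)  # cycle-consistent where this equals np.arange(T1)
```

In Dwibedi, the cycle-back step is differentiable and trained by back propagation; the argmax here is only to make the cycle-consistency check concrete.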
Bazin further teaches setting one of the plurality of pieces of synchronization target moving image data as the reference moving image data (Bazin teaches “the exemplary input may be a set of videos of the same or similar actions. One of these videos may be considered a reference video, and the goal is to synchronize the other video(s) to this reference video” in para. [0024]; here, the input data is equivalent to the synchronization target moving image data) and synchronize one piece of the synchronization target moving image data used as the reference moving image data with another piece of the synchronization target moving image data not used as the reference moving image data (Bazin, para. [0024], cited above; because “piece” is broadly interpreted as a collection of frames and the input data is equivalent to the synchronization target moving image data, the reference data can be interpreted as being synchronized with the synchronization target moving image data).

Plizzari further teaches with reference to the another piece of the moving image data (Plizzari teaches that “the Spatial Self-Attention module applies self-attention inside each frame to extract low-level features embedding the relations between body parts” in Section 4.2, and “extracting inter-frame relations between nodes in time [to] learn how to correlate frames apart from each other (e.g., nodes in the first frame with those in the last one)” in Section 4.3. The two streams are then fused (see Section 4.4) using the correlation of frames apart from each other; because the fusing/layering process uses the correlation between a frame to be fused and a separate frame in the input video, the frame that is not part of the current fusing (synchronizing) is interpreted as the piece of the moving image data being referenced during the synchronization process).

However, neither Jalata, nor Dwibedi, nor Switonski, nor Bazin, nor Plizzari, nor the combination, teaches the conversion array and the conversion array storage means, or synchronizing one piece of the moving image data used as the reference moving image data with another piece of the moving image data not used as the reference moving image data, with reference to the another piece of the moving image data, in combination with the remaining claim limitations, and any claims upon which claim 6 depends.
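Purely as an illustration of the two-stream idea cited from Plizzari above, the sketch below applies single-head self-attention across joints within each frame (spatial) and across frames for each joint (temporal), then fuses the two streams by averaging. The single head, the shared feature width, and the averaging fusion are simplifying assumptions rather than Plizzari's actual architecture.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the rows of X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ V

def two_stream(skeleton, params_spatial, params_temporal):
    """skeleton: (T, J, C) features for T frames and J joints.

    Spatial stream: attention across the J joints inside each frame.
    Temporal stream: attention across the T frames for each joint.
    The streams are fused by a simple average (an assumption here).
    """
    T, J, _ = skeleton.shape
    spatial = np.stack([self_attention(skeleton[t], *params_spatial)
                        for t in range(T)])                        # (T, J, C')
    temporal = np.stack([self_attention(skeleton[:, j], *params_temporal)
                         for j in range(J)], axis=1)               # (T, J, C')
    return (spatial + temporal) / 2.0
```

Here params_spatial and params_temporal are each a (Wq, Wk, Wv) triple of (C, C') matrices; both streams must share the same output width C' for the averaging to be valid.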
Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Contact

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KYLA G ALLEN, whose telephone number is (703) 756-5315. The examiner can normally be reached M-F, 7:30am - 4:30pm EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, John Villecco, can be reached at (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Kyla Guan-Ping Tiao Allen/
Examiner, Art Unit 2661

/JOHN VILLECCO/
Supervisory Patent Examiner, Art Unit 2661

Prosecution Timeline

Oct 16, 2023
Application Filed
Oct 06, 2025
Non-Final Rejection — §103
Jan 08, 2026
Response Filed
Feb 03, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597119
OPERATING METHOD OF ELECTRONIC DEVICE INCLUDING PROCESSOR EXECUTING SEMICONDUCTOR LAYOUT SIMULATION MODULE BASED ON MACHINE LEARNING
2y 5m to grant · Granted Apr 07, 2026
Patent 12588594
SYSTEM AND METHOD FOR IDENTIFYING LENGTHS OF PARTICLES
2y 5m to grant · Granted Mar 31, 2026
Patent 12591963
SYSTEM AND METHOD FOR ENHANCING DEFECT DETECTION IN OPTICAL CHARACTERIZATION SYSTEMS USING A DIGITAL FILTER
2y 5m to grant · Granted Mar 31, 2026
Patent 12548152
INTRACRANIAL ARTERY STENOSIS DETECTION METHOD AND SYSTEM
2y 5m to grant · Granted Feb 10, 2026
Patent 12541833
ASSESSING IMAGE/VIDEO QUALITY USING AN ONLINE MODEL TO APPROXIMATE SUBJECTIVE QUALITY VALUES
2y 5m to grant · Granted Feb 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
89%
Grant Probability
99%
With Interview (+17.1%)
3y 0m
Median Time to Grant
Moderate
PTA Risk
Based on 53 resolved cases by this examiner. Grant probability derived from career allow rate.
