Prosecution Insights
Last updated: May 29, 2026
Application No. 18/777,936

SIGNAL PROCESSING DEVICE AND SIGNAL PROCESSING METHOD

Non-Final OA §103
Filed: Jul 19, 2024
Priority: Jan 20, 2022 (JP 2022-007337, +1 more)
Examiner: YANG, NIEN
Art Unit: 2484
Tech Center: 2400 (Computer Networks)
Assignee: Yamaha Corporation
OA Round: 2 (Non-Final)
Grant Probability: 72% (Favorable)
OA Rounds: 2-3
Est. Remaining: 9m
With Interview: 99%

Examiner Intelligence

Grants 72%, above average
Career Allowance Rate: 72% (295 granted / 407 resolved; +14.5% vs TC avg)
Interview Lift: +28.2% (strong; resolved cases with interview)
Typical timeline
Avg Prosecution: 2y 8m (16 currently pending)
Career history
Total Applications: 430 (across all art units)

Statute-Specific Performance

§101: 0.7% (-39.3% vs TC avg)
§103: 97.1% (+57.1% vs TC avg)
§102: 1.4% (-38.6% vs TC avg)
§112: 0.4% (-39.6% vs TC avg)
The Tech Center average is an estimate. Based on career data from 407 resolved cases.
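The "vs TC avg" figures can be read as simple percentage-point differences from a Tech Center baseline. The page does not state how the deltas are computed, so the sketch below is only an illustration under that assumption, and the names `rows` and `implied_baseline` are hypothetical. It backs out the baseline implied by each statute row:

```python
# Values copied from the Statute-Specific Performance rows.
# Assumption (not stated on the page): "vs TC avg" equals
# examiner rate minus Tech Center average, in percentage points.
rows = {
    "101": (0.7, -39.3),
    "103": (97.1, 57.1),
    "102": (1.4, -38.6),
    "112": (0.4, -39.6),
}

def implied_baseline(rate_pct, delta_pct):
    """Return the Tech Center average implied by a rate and its delta."""
    return round(rate_pct - delta_pct, 1)

baselines = {statute: implied_baseline(rate, delta)
             for statute, (rate, delta) in rows.items()}

for statute, baseline in baselines.items():
    print(f"§{statute}: implied TC average {baseline}%")
```

Under this reading every row implies the same 40% baseline, which fits the page's description of the Tech Center figure as an estimate rather than a per-statute measurement.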

Office Action

§103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Preliminary Remarks
This is a reply to the amendments filed on 11/21/2025, in which, claims 1 and 7-20 are amended. Claims 1-20 remain pending in the present application with claims 1, 9, and 10 being independent claims. When making claim amendments, the applicant is encouraged to consider the references in their entireties, including those portions that have not been cited by the examiner and their equivalents as they may most broadly and appropriately apply to any particular anticipated claim amendments.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on September 23, 2025 is in compliance with the provisions of 37 CFR 1.97 and is being considered by the Examiner.
Response to Arguments
Applicant's arguments filed on 11/21/2025 with respect to amended claims 1, 9, and 10 have been considered but are moot in view of the new ground(s) of rejection.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2 and 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Kazi (US 20200250498 A1, hereinafter referred to as “Kazi”) in view of Lee et al. (US 20160253915 A1, hereinafter referred to as “Lee”), and further in view of Takahashi et al. (JP 2013105384 A, hereinafter referred to as “Takahashi”).
Regarding claim 1, Kazi discloses a signal processor comprising: attracted by the playing of the drum included in the performance image (see Kazi, paragraph [0111]: “the subject-specific feeling estimation unit 214 estimates a feeling of the image capturing person for each subject in the captured image on the basis of stored data (subject-specific degree of attention data or subject-specific sensor data)”) by inputting the performance image into a learning model (see Kazi, paragraph [0094]: “the image processing unit 215 may perform machine learning on the basis of, for example, a user selection history, and select an optimum approach of image treatment”), the learning model having been subjected to machine learning to estimate the degree of attention based on a feature quantity of the playing of the drum (see Kazi, paragraph [0089]: “The image processing unit 215 treats each subject appearing in the captured image on the basis of feeling information (feeling index) for each subject calculated by the subject-specific feeling estimation unit 214. Alternatively, the image processing unit 215 may treat each subject appearing in the captured image on the basis of a degree of attention for each subject calculated by the degree-of-attention calculation unit”); and an outputter configured to output the degree of attention estimated by the estimator (see Kazi, paragraph [0043]: “The processor 20 calculates the degree of attention for each subject on the basis of the data regarding a line-of-sight of the image capturing person, analyzes the sensor data of the image capturing person, and estimates a feeling of the image capturing person for each subject (feeling of the image capturing person directed to each subject). Then, the processor 20 treats the captured image on the basis of the degree of attention for each subject and information regarding estimated feeling (feeling index). The processor 20 generates an image closer to the landscape that the image capturing person has seen (felt) with naked eyes. The processor 20 outputs the generated image to the display”).
Regarding claim 1, Kazi discloses all the claimed limitations with the exception of an image obtainer configured to obtain a performance image including playing of a drum; and an estimator configured to estimate a degree of attention from viewers.
Lee from the same or similar fields of endeavor discloses an image obtainer configured to obtain a performance image including playing of a drum (see Lee, paragraph [0029]: “a musical instrument may correspond to … drum(s)” and paragraph [0090]: “music instruction system 115 may include a camera 145 that may capture video images 560 of the user during the user's performance. According to such an exemplary implementation, the user may be able to continue to observe the user interface and simultaneously view his/her performance”).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Lee with the teachings as in Kazi.
The motivation for doing so would ensure the system to have the ability to use the music instruction system including a camera disclosed in Lee to capture video images of the user during the user's performance wherein the musical instrument may correspond to drum(s) thus obtaining a performance image including playing of a drum in order to store the obtained performance image/video in the storage as image information so that it can be used to calculate a degree of attention attracted by the playing of the drum.
Regarding claim 1, Kazi and Lee disclose all the claimed limitations with the exception of an estimator configured to estimate a degree of attention from viewers.
Takahashi from the same or similar fields of endeavor discloses an estimator configured to estimate a degree of attention from viewers (see Takahashi, Page 11 of 14: “The attention level estimation device 1 is generated in step S13 by using the first learning data D1 learned by the attention level identification unit 50 and stored in the learning data storage unit 40 including the line-of-sight variation. The degree of attention to the feature quantity (feature quantity descriptor) is specified (estimated) (step S14). Through the above operation, the attention level estimation device 1 can estimate the attention level of the person H who views the video content (topic) with respect to the topic. At this time, the attention level estimation device 1 estimates the attention level without imposing a load on the person H because the body characteristics of the person H such as the body movement amount, the blink interval time, and the line-of-sight variation amount are measured by image processing … Also, the attention level estimation device 1 can accurately calculate the attention level by not using the line-of-sight feature amount for attention level estimation when there are many subtitles in the video content or when there is a lot of video motion”).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Takahashi with the teachings as in Kazi and Lee.
The motivation for doing so would ensure the system to have the ability to use the attention level estimation device for measuring the level of attention of a person who views video content disclosed in Takahashi to estimate the attention level of the person H who views the video content (topic) with respect to the topic using the learning data learned by the attention level identification unit thus estimating a degree of attention from viewers using an estimator in order to detect the content attracting a high degree of attention from a video so that the performance video can be edited based on the estimation result.
Regarding claim 2, the combination teachings of Kazi, Lee, and Takahashi as discussed above also disclose the signal processor according to claim 1, wherein in the machine learning, the learning model has been trained to learn learning data correlated with the degree of attention (see Kazi, paragraph [0065]: “The object recognition unit 211 recognizes (detects) an object (subject) by analyzing a captured image... In the general object recognition, an object is recognized by extracting features from an input image and classifying the features by using a learned classifier. In the specific object recognition, determination is performed by extracting features from an input image and collating the features with a database generated in advance”), and the degree of attention is based on a feature quantity that depends on a movement of a drummer of the drum shown in a learning image (see Kazi, paragraph [0066]: “The degree-of-attention calculation unit 212 calculates a period of time during which a line-of-sight has been fixed in a pixel of each object on the basis of a recognition result regarding an object (subject) in the captured image from the object recognition unit 211 and the data regarding a line-of-sight”).
The motivation for combining the references has been discussed in claim 1 above.
Claim 9 is rejected for the same reasons as discussed in claim 1 above.
Claim 10 is rejected for the same reasons as discussed in claim 9 above.
Claims 3-6 and 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kazi, Lee, and Takahashi as applied to claim 1, and further in view of Kobayashi (US 20140297012 A1, hereinafter referred to as “Kobayashi”).
Regarding claim 3, the combination teachings of Kazi, Lee, and Takahashi as discussed above also disclose the signal processor according to claim 1, wherein in the machine learning, the learning model has been trained to learn learning data correlated with the degree of attention (see Kazi, paragraphs [0073]-[0081]: “processing data (e.g., smile index calculated from, for example, a captured image or muscles of expression, data acquired by machine learning, and the like) calculated from one or more pieces of sensor data … The subject-specific feeling estimation unit 214 estimates a feeling of the image capturing person for each subject on the basis of the degree of attention for each subject calculated by the degree-of-attention calculation unit 212 and an analysis result regarding sensor data for the subject calculated by the sensor data analysis unit”).
Regarding claim 3, the combination teachings of Kazi, Lee, and Takahashi as discussed above disclose all the claimed limitations with the exceptions of the degree of attention is based on a feature quantity that depends on whether a particular tone produced by the drum is included in a performance sound corresponding to a learning image. Kobayashi from the same or similar fields of endeavor discloses the degree of attention is based on a feature quantity that depends on whether a particular tone produced by the drum is included in a performance sound corresponding to a learning image (see Kobayashi, paragraphs [0434]-[0437]: “the object may be a character appearing in a performance scene realised as a CG image … a method of reflecting various types of metadata stored in the metadata storage unit 112 on the performance scene realised as a CG image … the visualization parameter determination unit 114 acquires from the metadata storage unit 112 the metadata obtained as a result of the analysis processing by the music analysis unit 110 (S202). For example, beats, key, chord progression, melody line, bass line, presence probability and solo probability of each instrument sound, tone and genre of music, music structure, or the like, is acquired”). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Kobayashi with the teachings as in Kazi, Lee, and Takahashi. 
The motivation for doing so would ensure the system to have the ability to use the information processing apparatus and method disclosed in Kobayashi to reflect various types of metadata stored in the metadata storage unit on the performance image; to automatically detect feature quantity of music data changing in time series; to visualize the music data based on the feature quantity; and to analyze tone of music, melody line and beats of drum thus determining a feature quantity that depends on a particular tone or a number of tones produced by the drum is/are included in a performance sound corresponding to a learning image; or depending on a rhythm similarity degree indicating how a rhythm of a tone produced by the drum included in a performance sound corresponding to a learning image is similar to a rhythm of a tone produced by a musical instrument different from the drum; or depending on information indicating whether a part of music being played corresponds to a musical bar that is before a change in melody, the information being determined using musical score information corresponding to a learning image in order to estimate a degree of attention attracted by the playing of the drum included in the performance image by inputting the performance image into a learning model so that the degree of attention can be estimated based on a feature quantity related to the playing of the drum. 
Regarding claim 4, the combination teachings of Kazi, Lee, Takahashi, and Kobayashi as discussed above also disclose the signal processor according to claim 1, wherein in the machine learning, the learning model has been trained to learn learning data correlated with the degree of attention (see Kazi, paragraphs [0073]-[0081]: “processing data (e.g., smile index calculated from, for example, a captured image or muscles of expression, data acquired by machine learning, and the like) calculated from one or more pieces of sensor data … The subject-specific feeling estimation unit 214 estimates a feeling of the image capturing person for each subject on the basis of the degree of attention for each subject calculated by the degree-of-attention calculation unit 212 and an analysis result regarding sensor data for the subject calculated by the sensor data analysis unit”), and the degree of attention is based on a feature quantity that depends on a number of tones produced by the drum included in a performance sound corresponding to a learning image (see Kobayashi, paragraph [0431]: “generate a calculation formula for computing the tone of music, it is only necessary to input, along with a plurality of log spectra of music data whose tones are known as the evaluation data, decision values indicating the tone of music as the teacher data. By using a calculation formula generated from these inputs by the learning processing by the feature quantity calculation formula generation apparatus 10 and by inputting a log spectrum of a whole music piece to the calculation formula, the tone of music of the music piece is computed as the metadata per music piece”). The motivation for combining the references has been discussed in claim 3 above. 
Regarding claim 5, the combination teachings of Kazi, Lee, Takahashi, and Kobayashi as discussed above also disclose the signal processor according to claim 1, wherein in the machine learning, the learning model has been trained to learn learning data correlated with the degree of attention (see Kazi, paragraphs [0073]-[0081]: “processing data (e.g., smile index calculated from, for example, a captured image or muscles of expression, data acquired by machine learning, and the like) calculated from one or more pieces of sensor data … The subject-specific feeling estimation unit 214 estimates a feeling of the image capturing person for each subject on the basis of the degree of attention for each subject calculated by the degree-of-attention calculation unit 212 and an analysis result regarding sensor data for the subject calculated by the sensor data analysis unit”), and the degree of attention is based on a feature quantity that depends on a rhythm similarity degree indicating how a rhythm of a tone produced by the drum included in a performance sound corresponding to a learning image is similar to a rhythm of a tone produced by a musical instrument different from the drum (see Kobayashi, paragraph [0301]: “The similarity probability which has been converted can be visualized as FIG. 38, for example. The vertical axis of FIG. 38 corresponds to a position in the first focused beat section, and the horizontal axis corresponds to a position in the second focused beat section. Furthermore, the intensity of colours plotted on the two-dimensional plane indicates the degree of similarity probabilities between the first focused beat section and the second focused beat section at the coordinate”). The motivation for combining the references has been discussed in claim 3 above. 
Regarding claim 6, the combination teachings of Kazi, Lee, Takahashi, and Kobayashi as discussed above also disclose the signal processor according to claim 1, wherein in the machine learning, the learning model has been trained to learn learning data correlated with the degree of attention (see Kazi, paragraphs [0073]-[0081]: “processing data (e.g., smile index calculated from, for example, a captured image or muscles of expression, data acquired by machine learning, and the like) calculated from one or more pieces of sensor data … The subject-specific feeling estimation unit 214 estimates a feeling of the image capturing person for each subject on the basis of the degree of attention for each subject calculated by the degree-of-attention calculation unit 212 and an analysis result regarding sensor data for the subject calculated by the sensor data analysis unit”), and the degree of attention is based on a feature quantity that depends on information indicating whether a part of music being played corresponds to a musical bar that is before a change in melody, the information being determined using musical score information corresponding to a learning image (see Kobayashi, paragraph [0406]: “the melody line determination unit 288 computes the rate of appearance of pitch transition whose change amount Δo at the correct melody line of each music data. After computing the appearance rate of each pitch transition Δo for a number of pieces of music data, the melody line determination unit 288 computes, for each pitch transition Δo, the average value and the standard deviation for the appearance rate for all the pieces of music data”).
The motivation for combining the references has been discussed in claim 3 above.
Claim 11 is rejected for the same reasons as discussed in claim 3 above.
Claim 12 is rejected for the same reasons as discussed in claim 4 above.
Claim 13 is rejected for the same reasons as discussed in claim 4 above.
Claim 14 is rejected for the same reasons as discussed in claim 4 above.
Claim 15 is rejected for the same reasons as discussed in claim 5 above.
Claim 16 is rejected for the same reasons as discussed in claim 5 above.
Claim 17 is rejected for the same reasons as discussed in claim 5 above.
Claim 18 is rejected for the same reasons as discussed in claim 5 above.
Claim 19 is rejected for the same reasons as discussed in claim 5 above.
Claim 20 is rejected for the same reasons as discussed in claim 5 above.
Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Kazi, Lee, and Takahashi as applied to claim 1, and further in view of Sneidern et al. (US 20160080835 A1, hereinafter referred to as “Sneidern”).
Regarding claim 7, the combination teachings of Kazi, Lee, and Takahashi as discussed above also disclose the signal processor according to claim 1, wherein the performance image includes a plurality of performance images (see Lee, paragraph [0090]: “music instruction system 115 may include a camera 145 that may capture video images 560 of the user during the user's performance”).
Regarding claim 7, the combination teachings of Kazi, Lee, and Takahashi as discussed above disclose all the claimed limitations with the exceptions of the signal processor further comprises an editor configured to: select, from the plurality of performance images, a first performance image having a score equal to or higher than a threshold, the score being determined based on the degree of attention estimated by the estimator; and generate a first video image using the selected first performance image.
Sneidern from the same or similar fields of endeavor discloses the signal processor further comprises an editor configured to: select, from the plurality of performance images, a first performance image having a score equal to or higher than a threshold, the score being determined based on the degree of attention estimated by the estimator (see Sneidern, paragraph [0028]: “The numerical representation may indicate a feature's level of interestingness. The baseline feature set may be a set of threshold values for each feature. Source video clips with one or more features that exceed the threshold values of the baseline feature set may be deemed interesting for inclusion in a compilation video”); and generate a first video image using the selected first performance image (see Sneidern, paragraph [0029]: “The feature vector may be evaluated against a baseline feature set that includes threshold values for the features. Video clips associated with features that exceed one or more of the thresholds in the baseline feature set may be organized into a compilation video based on the metadata”). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings as in Sneidern with the teachings as in Kazi, Lee, and Takahashi. 
The motivation for doing so would ensure the system to have the ability to use the system and method for creating a compilation video disclosed in Sneidern to use the numerical representation to indicate a feature's level of interestingness; to evaluate feature vector against a baseline feature set that includes threshold values for the features; and to generate a compilation video using video clips associated with features that exceed one or more of the thresholds thus selecting a performance image having a score equal to or higher than a threshold from the plurality of performance images and generating a video image using the selected performance image in order to identify an image that conveys strong interest so that a shortened performance video can be generated with selected images.
Regarding claim 8, the combination teachings of Kazi, Lee, Takahashi, and Sneidern as discussed above also disclose the signal processor according to claim 7, wherein the image obtainer is configured to obtain a plurality of images of the playing of the drum (see Lee, paragraph [0040]: “The visual images may be displayed on display 120. Music instruction system 115 may send and/or receive visual images to/from other persons (e.g., other users, a remote instructor, etc.) via a network (e.g., the Internet, etc.)”), and the editor is further configured to: identify, from the plurality of performance images, a set of performance images that were taken at an identical time (see Sneidern, paragraph [0083]: “the baseline feature set may include time or date-based features. For example, at block 610 a date or a date range within which video clips were recorded may be identified as a feature. Video frames and video clips of the one or more source videos may be given a relevance score at block 615 based on the time it was recorded”); select, from the identified set of performance images, a second performance image having a score equal to or higher than the threshold (see Sneidern, paragraph [0028]: “The numerical representation may indicate a feature's level of interestingness. The baseline feature set may be a set of threshold values for each feature. Source video clips with one or more features that exceed the threshold values of the baseline feature set may be deemed interesting for inclusion in a compilation video”); and generate a second video image using the selected second performance image (see Sneidern, paragraph [0029]: “The feature vector may be evaluated against a baseline feature set that includes threshold values for the features. Video clips associated with features that exceed one or more of the thresholds in the baseline feature set may be organized into a compilation video based on the metadata”).
The motivation for combining the references has been discussed in claims 1 and 7 above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIENRU YANG whose telephone number is (571)272-4212. The examiner can normally be reached Monday-Friday 10AM-6PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI TRAN can be reached at 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
NIENRU YANG
Examiner, Art Unit 2484
/NIENRU YANG/
Examiner, Art Unit 2484
/THAI Q TRAN/
Supervisory Patent Examiner, Art Unit 2484

Prosecution Timeline

Aug 21, 2025: Non-Final Rejection mailed (§103)
Nov 21, 2025: Response Filed
Feb 11, 2026: Final Rejection mailed (§103)
Apr 08, 2026: Examiner Interview Summary
Apr 08, 2026: Applicant Interview (Telephonic)
Apr 13, 2026: Response after Non-Final Action
May 11, 2026: Request for Continued Examination
May 22, 2026: Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12634571: CAMERA SYSTEM FOR AERIAL MAPPING (1y 6m to grant; granted May 19, 2026)
Patent 12626724: COHERENT SLOW-MOTION VIDEO DISPLAYED ACROSS MULTIPLE DEVICES (2y 11m to grant; granted May 12, 2026)
Patent 12620114: IMAGE PROCESSING DEVICE AND MACHINE TOOL (2y 8m to grant; granted May 05, 2026)
Patent 12604024: REPRODUCTION DEVICE, REPRODUCTION METHOD, AND RECORDING MEDIUM (1y 5m to grant; granted Apr 14, 2026)
Patent 12592259: SYSTEMS AND METHODS TO EDIT VIDEOS TO REMOVE AND/OR CONCEAL AUDIBLE COMMANDS (3y 9m to grant; granted Mar 31, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 72%
With Interview: 99% (+28.2%)
Median Time to Grant: 2y 8m (~9m remaining)
PTA Risk: Moderate
Based on 407 resolved cases by this examiner. Grant probability derived from career allowance rate.
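The projection figures can be checked against the numbers given elsewhere on the page. The sketch below is a rough reconstruction, not the tool's actual method. The additive interview lift, the 99% display cap, and the 30.4375-day average month are all assumptions made for illustration.

```python
from datetime import date

# Career allowance rate from the examiner statistics.
granted, resolved = 295, 407
allowance_pct = 100 * granted / resolved

# Assumption: the interview lift is added in percentage points
# and the displayed value is capped at 99%.
interview_lift_pct = 28.2
with_interview_pct = min(99.0, allowance_pct + interview_lift_pct)

# Remaining time: median time to grant minus time elapsed since filing.
filed = date(2024, 7, 19)      # "Filed" date on the page
as_of = date(2026, 5, 29)      # "Last updated" date on the page
median_months = 2 * 12 + 8     # 2y 8m
AVG_DAYS_PER_MONTH = 30.4375   # assumption: 365.25 / 12
elapsed_months = (as_of - filed).days / AVG_DAYS_PER_MONTH
remaining_months = int(median_months - elapsed_months)  # whole months left

print(f"Allowance rate: {allowance_pct:.1f}%")
print(f"With interview (capped): {with_interview_pct:.0f}%")
print(f"Estimated months remaining: {remaining_months}")
```

Under these assumptions the allowance rate rounds to 72%, the uncapped interview figure would exceed 100%, and about 9 whole months remain, which matches the values displayed.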
