Prosecution Insights
Last updated: April 19, 2026
Application No. 18/680,101

METHOD FOR EDITING MOVING IMAGE, STORAGE MEDIUM STORING EDITING PROGRAM, AND EDITING DEVICE

Non-Final Office Action (§102, §103, §112)

Filed: May 31, 2024
Examiner: PROTAZI, BRIGITER DIVULALE
Art Unit: 2612
Tech Center: 2600 — Communications
Assignee: LY CORPORATION
OA Round: 1 (Non-Final)
Grant Probability: Favorable
OA Rounds: 1-2
To Grant: 2y 9m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -62.0% vs TC avg). Grants 0% of cases.
Interview Lift: +0.0% (minimal lift; comparison of resolved cases with vs. without an interview)
Avg Prosecution (typical timeline): 2y 9m
Total Applications: 8 across all art units (career history); 8 currently pending

Statute-Specific Performance

§101: 11.1% (-28.9% vs TC avg)
§102: 33.3% (-6.7% vs TC avg)
§103: 38.9% (-1.1% vs TC avg)
§112: 16.7% (-23.3% vs TC avg)

Tech Center averages shown are estimates. Based on career data from 0 resolved cases.

Office Action

Rejections: §102, §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 5 recites “estimating a feeling of a person in the moving image based on a feature, wherein the analyzing the moving image obtains the feature by analyzing the moving image, and the extracting the specific timing is performed based on the feeling of the person.” Here the phrase “the feeling of a person” has indefinite scope. Further revision is required to provide a clear explanation of the claimed subject matter.

Claim 9 recites “determining a characteristic of each of the one or more scenes”. This limitation fails to explain how a characteristic of a scene is determined, how the determination is made, or what criteria are used to make the determination. Further revision is required to provide a clear explanation of the claimed subject matter.

Claims 10, 11, 14, 15, 16, 17, and 18 are also rejected by virtue of their dependency on claim 9.

Claim 17 recites “collecting an evaluation regarding another moving image different from the moving image, wherein the determining the background sound is performed based on a result of the evaluation and other background sound included in the other moving image”. The claim is not clear and has indefinite scope. Further revision is required to provide a clear explanation of the claimed subject matter.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 9, 11, 14, 19, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by HATANO (JP2011077883A, “Hatano”).

Regarding claim 1, Hatano teaches “A method for editing a moving image by a computer, the method comprising: obtaining the moving image;” ([0010] An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file); “extracting a specific timing in the moving image by analyzing the moving image;” (analyzing the moving image file by the changed analysis method, and the optimal audio extraction timing for the generated still image file from the audio mode information and the analysis result of the moving image file; [0011]); “and adding at least one sound effect to the specific timing.” (generating a sound-added still image file in association with the optimal speech and the generated still image file; [0011]).
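For readers less familiar with the claim language, the steps of claim 1 (obtain the moving image, extract a “specific timing” by analysis, add a sound effect at that timing) amount to a simple analyze-then-annotate pipeline. The sketch below illustrates only that shape; the Frame and SoundEffect types, the helper names, and the loudness-based timing pick are assumptions made for illustration and are not taken from the application or from Hatano.

```python
# Minimal sketch of the claim 1 pipeline: obtain a moving image, extract a
# "specific timing" by analysing it, and attach a sound effect at that timing.
# Frame, SoundEffect, and the loudness-based pick are hypothetical stand-ins,
# not the applicant's or Hatano's implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int
    loudness: float  # stand-in for any per-frame feature obtained by analysis


@dataclass
class SoundEffect:
    name: str
    start_frame: int


def extract_specific_timing(frames: List[Frame]) -> int:
    """Pick the frame with the strongest feature as the 'specific timing'."""
    return max(frames, key=lambda f: f.loudness).index


def add_sound_effect(frames: List[Frame], effect_name: str) -> SoundEffect:
    """Associate a named effect with the extracted timing."""
    return SoundEffect(name=effect_name, start_frame=extract_specific_timing(frames))


if __name__ == "__main__":
    clip = [Frame(i, level) for i, level in enumerate([0.2, 0.9, 0.4])]
    print(add_sound_effect(clip, "applause"))  # start_frame=1, the loudest frame
```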
Regarding claim 2, Hatano teaches “wherein the analyzing the moving image obtains a feature by analyzing an image included in the moving image,” ([0010] An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file, thereby extracting sound at an optimal timing for the still image and generating a still image file with optimal sound); “and the extracting the specific timing is performed based on the feature.” (the optimal audio extraction timing for the generated still image file; [0011]).

Regarding claim 3, Hatano teaches “wherein the analyzing the moving image obtains a feature by analyzing a sound included in the moving image,” (The analysis method of the moving image file is changed according to sound mode information, the moving image file is analyzed by the changed analysis method, and the generated still image is analyzed from the sound mode information and the analysis result of the moving image file; [0016]); “and the extracting the specific timing is performed based on the feature.” (the optimal audio extraction timing for the generated still image file; [0011]).

Regarding claim 4, Hatano teaches “wherein the analyzing the moving image obtains a feature by analyzing both an image and a sound included in the moving image,” ([0010] An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file, thereby extracting sound at an optimal timing for the still image and generating a still image file with optimal sound); “and the extracting the specific timing is performed based on the feature.” (the optimal audio extraction timing for the generated still image file; [0011]).

Regarding claim 5, Hatano teaches “estimating a feeling of a person in the moving image based on a feature,” (The still image analysis means 16 detects the person who is the main subject by face detection for the input still image file, and the detected person size is a predetermined value or more (for example, the face size is If the input still image is a square that is about 1/6 to 1/7 of the short side of the still image), the mode is set to emphasize the voice expressing emotion of the person (person mode); [0026]); “wherein the analyzing the moving image obtains the feature by analyzing the moving image,” (For example, when the still image shown in FIG. 7A is selected, the moving image file is analyzed, and a scene (FIG. 7B) where the mouth is wide open is detected; [0037]); “and the extracting the specific timing is performed based on the feeling of the person.” (The scene in FIG. 7B is a timing when the volume is high because the child is talking (enlarged view of a person; see FIG. 7C), and the timing of FIG. 7B is the extraction timing; [0037]). The still image is analyzed to determine whether a person is detected; if so, the person's expression is then analyzed. If person mode is set, the feeling of the person is determined. Capturing the person's face helps estimate the feeling the person is expressing, which can then be used to associate the still image with a sound that reflects that feeling.
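The “person mode” reasoning the examiner cites from Hatano (face detection, a face-size threshold of roughly one sixth of the image's short side, then mouth openness or a smile as a proxy for emotion) can be pictured with a short sketch. Apart from the 1/6 threshold echoed from Hatano's example, everything below (FaceObservation, the 0.7 mouth-openness cutoff, the smile fallback) is a hypothetical simplification, not Hatano's or the applicant's actual algorithm.

```python
# Illustrative only: a face-size gate followed by a crude expression-based
# timing pick, loosely in the spirit of Hatano's "person mode" ([0026]-[0028]).
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FaceObservation:
    frame_index: int
    face_height: float     # detected face size, in pixels
    mouth_openness: float  # 0..1, from some upstream analysis (assumed)
    smile_score: float     # 0..1, from some upstream analysis (assumed)


def person_mode(face_height: float, image_short_side: float) -> bool:
    """Enable the emotion-driven mode only when the face is large enough."""
    return face_height >= image_short_side / 6.0


def pick_timing_by_feeling(obs: List[FaceObservation], image_short_side: float) -> Optional[int]:
    """Prefer a wide-open-mouth (talking) frame; otherwise fall back to the strongest smile."""
    candidates = [o for o in obs if person_mode(o.face_height, image_short_side)]
    if not candidates:
        return None  # no sufficiently large face: person mode does not apply
    talking = [o for o in candidates if o.mouth_openness > 0.7]
    if talking:
        return max(talking, key=lambda o: o.mouth_openness).frame_index
    return max(candidates, key=lambda o: o.smile_score).frame_index
```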
Regarding claim 6, Hatano teaches “detecting, in the moving image, a person or an object that is set in advance, by analyzing an image included in the moving image,” ([0028] Next, when the input audio mode information is the person mode, the shape and sound of the person's mouth are analyzed for the same event or the same scene in the moving image file, the mouth is opened wide, and the volume); “wherein the extracting the specific timing is performed based on a result of detection of the person or the object.” (If there is a large timing (that is, whether or not the voice is being spoken), that timing is taken as the extraction timing, and if there is no scene with the mouth wide open, facial expressions are analyzed, smiling and loud. The timing (that is, whether the voice is uttered when there is a facial expression) is set as the extraction timing; [0028]).

Regarding claim 9, Hatano teaches “A method for editing a moving image by a computer, the method comprising: obtaining the moving image;” ([0010] An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file); “dividing the moving image into one or more scenes by analyzing the moving image;” (an image is selected from a plurality of still images constituting the moving image; [0003]; An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file; [0010]); “determining a characteristic of each of the one or more scenes;” (the step of reading the moving image file, and the read moving image file Extracting a selected still image to generate a still image file; [0011]); “and determining a background sound to be added to each of the one or more scenes, based on the characteristic determined for a corresponding one of the one or more scenes.” (analyzing the generated still image file to generate audio mode information; and analyzing the moving image file based on the audio mode information; [0011]).

Regarding claim 11, Hatano teaches “wherein the analyzing the moving image obtains a feature by analyzing a sound included in the moving image,” (The analysis method of the moving image file is changed according to sound mode information, the moving image file is analyzed by the changed analysis method, and the generated still image is analyzed from the sound mode information and the analysis result of the moving image file; [0016]); “and the dividing the moving image into the one or more scenes is performed based on the feature.” (an image is selected from a plurality of still images constituting the moving image; [0003]; An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file; [0010]; [0005] Furthermore, in Patent Document 3, moving image data and audio data are separated from the acquired moving image data with sound, frame division processing is performed from the separated moving image data, and the image is extracted as still image data of a plurality of frames).

Regarding claim 14, Hatano teaches “the analyzing the moving image obtains a feature by analyzing an image included in the moving image,” ([0010] An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file, thereby extracting sound at an optimal timing for the still image and generating a still image file with optimal sound); “wherein the dividing the moving image into the one or more scenes is performed based on the feature.” (an image is selected from a plurality of still images constituting the moving image; [0003]; An object of the present invention is to analyze a still image file generated from a moving image file or a moving image file; [0010]; [0005] Furthermore, in Patent Document 3, moving image data and audio data are separated from the acquired moving image data with sound, frame division processing is performed from the separated moving image data, and the image is extracted as still image data of a plurality of frames).

Regarding claim 19, Hatano teaches “A non-transitory computer-readable storage medium storing an editing program, which when executed by at least one processor, causes a computer to perform the method according to claim 1.” (Further, the present invention may be configured as the above-described image file generation program as a computer-readable medium or a computer-readable memory; [0049]).

Regarding claim 20, Hatano teaches “An editing device configured to perform the method according to claim 1.” ([0019] An image file generation apparatus of the present invention that implements an image file generation method according to the present invention will be described in detail below based on preferred embodiments shown in the accompanying drawings).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7, 8, 10, 12, 13, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over HATANO in view of ASANO et al. (JP2019205158A, “Asano”).

Regarding claim 7, Hatano fails to teach the limitations of claim 7. However, Asano teaches “wherein the adding the sound effect is performed by referring to a table that is set in advance to associate a type of the specific timing and the sound effect.” ([0036] FIG. 5 shows an outline of the table structure of the sound effect database of the present embodiment; [0037] In this embodiment, sound data (sound effects) is managed using a sound effect database having a table structure as shown in FIG. 5, and the search unit 32 refers to the sound effect database to search for sound data (S108)). The motivation for the above is to have an efficient, predetermined list of sound effects for the moving image.

Regarding claim 8, Hatano fails to teach the limitations of claim 8. However, Asano teaches “wherein the adding the sound effect is performed by dynamically generating the sound effect based on an analysis result of the moving image at the specific timing.” (The identified sound data is acquired from the sound data stored in the auxiliary storage unit 105 (S204). As a result, sound data of sound effects to be added to the moving image data (reproduction data, first reproduction data) currently being reproduced is generated (extracted); [0037]). The motivation for the above is to easily identify the sound effect best related to the moving image.
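Claim 7's table “set in advance to associate a type of the specific timing and the sound effect”, and the role the examiner assigns to Asano's FIG. 5 sound effect database, both reduce to a keyed lookup. A minimal sketch follows; the keys and file names are invented purely for illustration and are not taken from the application or Asano.

```python
# Sketch of a pre-set lookup table associating a "type" of specific timing
# with a sound effect, as recited in claim 7. Entries are hypothetical.
from typing import Optional

SOUND_EFFECT_TABLE = {
    "loud_speech": "cheer.wav",
    "wide_open_mouth": "laugh.wav",
    "scene_change": "whoosh.wav",
}


def lookup_sound_effect(timing_type: str) -> Optional[str]:
    """Return the pre-associated effect for this timing type, or None if unmapped."""
    return SOUND_EFFECT_TABLE.get(timing_type)


print(lookup_sound_effect("scene_change"))  # whoosh.wav
```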
Regarding claim 10, Hatano fails to teach the limitations of claim 10. However, Asano teaches “wherein the analyzing the moving image includes analyzing a language included in the moving image,” ([0058] [Modification 6] In the information processing system 10 of the above-described embodiment, the voice (recognition word) recognized by the voice recognition unit is used as an onomatopoeia, and sound data (sound effect) is added to the moving image data based on the onomatopoeia; corresponding to this, the recognition word in the sound effect database is set not only for Japanese but also for other languages such as English, so that a search according to the language is possible); “and the dividing the moving image is performed based on an analysis result of the language.” (Some of them include both moving image data (image information) and sound data (sound information). The moving image (moving image data) is composed of a plurality of frame-by-frame images (image data) arranged in time series. Hereinafter, reproduction data including moving image data may be simply referred to as moving image data; [0020]). The motivation for the above is to enable easy analysis of the language within the moving image for better efficiency.

Regarding claim 12, Hatano fails to teach the limitations of claim 12. However, Asano teaches “extracting an utterance section of an utterer in the moving image based on the feature,” ([0038] For example, when the voice uttered by the worker using the system 10 is “don't care”, when the voice recognition unit 31 recognizes this voice (YES in S106), the search unit 32 refers to the sound effect database, and the sound data of the sound effect corresponding to the recognized voice “Dokan” (recognition word) is searched (S203). As shown in FIG. 5, the sound data of the sound effect corresponding to the recognition word “Dokan” is “Sound data C2 (explosive pronunciation)”, and therefore, among the sound data stored in the auxiliary storage unit 105, “Sound data C2” is acquired (S204)); “wherein the dividing the moving image into the one or more scenes based on the feature is performed based on the utterance section.” (Some of them include both moving image data (image information) and sound data (sound information). The moving image (moving image data) is composed of a plurality of frame-by-frame images (image data) arranged in time series. Hereinafter, reproduction data including moving image data may be simply referred to as moving image data; [0020]). The motivation for the above is to apply voice recognition analysis to the moving image for better efficiency.

Regarding claim 13, Hatano fails to teach the limitations of claim 13. However, Asano teaches “wherein the extracting the utterance section includes identifying the utterer in the moving image,” ([0039] Here, when the voice (recognition word) uttered by the operator overlaps with a plurality of genres, for example, when the recognition word is “about” shown in FIG. 5, the search unit 32 makes a sound corresponding to “about”.… In this case, one sound data is randomly specified (extracted) from the plurality of sound data and acquired); “and the dividing the moving image into the one or more scenes based on the utterance section is performed based on a result of identification of the utterer.” (Some of them include both moving image data (image information) and sound data (sound information). The moving image (moving image data) is composed of a plurality of frame-by-frame images (image data) arranged in time series. Hereinafter, reproduction data including moving image data may be simply referred to as moving image data; [0020]). The motivation for the above is efficient voice recognition analysis of the moving image.

Regarding claim 15, Hatano fails to teach the limitations of claim 15. However, Asano teaches “wherein the determining the background sound is performed by referring to a table that is set in advance to associate a type of each of the one or more scenes and the background sound.” (Obtaining sound information corresponding to the input recognized by the recognition means from the sound information stored in the storage means, and generating sound information associated with the reproduction data, the storage means The sound information stored in the table is managed by classification, and further comprises a specifying means capable of specifying a classification of the sound information to be generated by the generating means among the classifications; [0011]). The motivation for the above is to choose an appropriate sound efficiently from a predetermined table of sounds.

Regarding claim 16, Hatano fails to teach the limitations of claim 16. However, Asano teaches “wherein the determining the background sound includes dynamically generating the background sound based on an analysis result of the moving image for each of the one or more scenes.” ([0012] According to this, sound information is generated based on the input during reproduction of the reproduction data, and the generated sound information is associated with the reproduction data, so that it is possible to improve the efficiency of the work related to image production). The motivation for the above is to generate a compatible sound from the analysis of the moving image for better efficiency.

Regarding claim 17, Hatano fails to teach the limitations of claim 17. However, Asano teaches “collecting an evaluation regarding another moving image different from the moving image,” (When the “Preview” button 202 is pressed (clicked), moving image data to which sound data corresponding to the sound icon displayed in the sound editing display area 204 is added, that is, the sound data and moving image data to be edited are displayed. The reproduction data (second reproduction data) obtained by synthesizing the data is reproduced from the beginning; [0028]; Then, the editing unit 22 changes and edits the reproduction position information (reproduction time information) of the sound data, thereby evaluating the sound icon IC according to the movement of the sound data on the timeline (on the time axis); [0044]); “wherein the determining the background sound is performed based on a result of the evaluation and other background sound included in the other moving image.” (The editing unit 22 is configured to be able to execute various types of editing processing such as adding the sound data read by the reading unit 23 to the moving image data and adjusting the reproduction position of the sound data with respect to the moving image data. Further, the playback unit 21 is configured to be able to execute processing related to playback of playback data including sound data and moving image data edited by the editing unit 22; [0019]). The evaluation is collected by previewing the reproduced moving image and making adjustments during playback.
The user's observations of, and interactions with, the editing of the moving image serve as the evaluation being collected; the user's ability to view and edit the moving image constitutes the evaluation. In the editing unit, adjustments are made based on the evaluation collected from the user. Thus, a proper sound can be added to the moving image based on the evaluation from the editing unit. The motivation for the above is to better evaluate the moving image so that an accurate background sound is chosen for the moving image.

Regarding claim 18, Hatano fails to teach the limitations of claim 18. However, Asano teaches “wherein the determining the background sound is performed based on an attribute of a user who edits or views the moving image.” (The timeline display area 203 also displays the waveform HK of the sound data when the reproduction data of the moving image reproduced and displayed in the reproduction display area 202 includes sound data such as BGM (background music). That is, the timeline display area 203 can also display reproduction position information (sound timeline) on the time axis of sound data. As a result, by looking at the timeline display area 203, the operator can see the correspondence between the reproduction position of the moving image reproduced and displayed in the image reproduction display area 202, the sound reproduced accompanying the moving image, and its reproduction position; [0025]). The motivation for the above is to allow the background sound to be chosen with user control.

Hatano and Asano are analogous art, as both are related to image file production and information processing. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hatano with Asano's teachings that: the adding the sound effect is performed by referring to a table that is set in advance to associate a type of the specific timing and the sound effect; the adding the sound effect is performed by dynamically generating the sound effect based on an analysis result of the moving image at the specific timing; the analyzing the moving image includes analyzing a language included in the moving image, and the dividing the moving image is performed based on an analysis result of the language; extracting an utterance section of an utterer in the moving image based on the feature, wherein the dividing the moving image into the one or more scenes based on the feature is performed based on the utterance section; the extracting the utterance section includes identifying the utterer in the moving image, and the dividing the moving image into the one or more scenes based on the utterance section is performed based on a result of identification of the utterer; the determining the background sound is performed by referring to a table that is set in advance to associate a type of each of the one or more scenes and the background sound; the determining the background sound includes dynamically generating the background sound based on an analysis result of the moving image for each of the one or more scenes; collecting an evaluation regarding another moving image different from the moving image, wherein the determining the background sound is performed based on a result of the evaluation and other background sound included in the other moving image; and the determining the background sound is performed based on an attribute of a user who edits or views the moving image, and to use these teachings with Hatano's method of editing a moving image.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIGITER D PROTAZI, whose telephone number is (571) 272-7995. The examiner can normally be reached Monday - Friday, 7:30-5.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Said A Broome, can be reached at 571-272-2931. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/B.D.P./ Examiner, Art Unit 2612
/Said Broome/ Supervisory Patent Examiner, Art Unit 2612

Prosecution Timeline

May 31, 2024: Application Filed
Jan 15, 2026: Non-Final Rejection — §102, §103, §112
Apr 08, 2026: Response Filed


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
