Prosecution Insights
Last updated: April 19, 2026
Application No. 18/498,153

METHODS AND SYSTEMS FOR FACILITATING ANNOTATION OF VIDEOS

Non-Final OA: §101, §103
Filed: Oct 31, 2023
Examiner: HAUK, EMILY ROSE
Art Unit: 2669
Tech Center: 2600 — Communications
Assignee: Toyota Jidosha Kabushiki Kaisha
OA Round: 1 (Non-Final)
Grant Probability: 100% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 100% (2 granted / 2 resolved), above average, +38.0% vs TC avg
Interview Lift: +100.0%, a strong lift, based on resolved cases with interview
Avg Prosecution: 2y 9m (typical timeline), 8 applications currently pending
Total Applications: 10 across all art units (career history)

Statute-Specific Performance

§101: 27.3% (-12.7% vs TC avg)
§103: 45.5% (+5.5% vs TC avg)
§102: 9.1% (-30.9% vs TC avg)
§112: 15.2% (-24.8% vs TC avg)
Deltas are relative to a Tech Center average estimate • Based on career data from 2 resolved cases
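If each delta is read as this examiner's rate minus the Tech Center average estimate, the implied TC average works out to roughly 40% for every statute, for example §101: 27.3% + 12.7% = 40.0% and §103: 45.5% - 5.5% = 40.0%.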

Office Action

§101 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The use of the terms Z-Wave, ZigBee, WiMAX, and Amazon SageMaker, which are trade names or marks used in commerce, has been noted in this application. Each term should be accompanied by the generic terminology; furthermore, the term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce, such as ™, SM, or ® following the term. Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception, in particular an abstract idea falling under the mental processes grouping, without significantly more. This judicial exception is not integrated into a practical application because the additional limitations provide insignificant extra-solution activity and simply implement the abstract idea on generically recited computer elements. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because extracting frames and sending notifications are well-understood, routine, conventional computer functions, as recognized by the court decisions listed in MPEP 2106.05(d).

Step 1: The claims in question are directed primarily to a computer-implemented method/process. The corresponding apparatus claims are congruent in scope and are understood to be directed to a machine for purposes of the Step 1 analysis. (Step 1: YES)

Step 2A, Prong One: Step 2A, Prong One of the eligibility analysis evaluates whether the claim recites a judicial exception (a law of nature, a natural phenomenon, or an abstract idea). MPEP 2106.04, subsection II, states that a claim "recites" a judicial exception when the judicial exception is "set forth" or "described" in the claim. Claim 1 at a high level recites extracting frames using a neural network (limitation A), assigning tasks using a neural network (limitation B), determining/comparing a score and a threshold using a neural network (limitation C), flagging annotations (limitation D), and sending notifications (limitation E). Limitation B recites assigning, which encompasses observing the matchability score and historical annotation performance and providing a judgment or determination on assigning the task. Limitation C recites a determination of whether a score is below a threshold, which is an evaluation that could practically be performed in the human mind. Limitation D recites flagging an annotation, which encompasses observing the outcome of limitation C and evaluating that outcome. Such mental observations, evaluations, and judgments fall under the mental processes grouping (concepts performed in the human mind, including observations, evaluations, judgments, and opinions). For example, a manager could evaluate a worker's specialty or task focus and the worker's past performance rate at a specific task, and use this evaluation to assign specific tasks to specific workers to match their specialty and performance rate. The manager could evaluate the result of a task by calculating a score for the task and comparing it to a benchmark threshold. This comparison could then be used by the manager to flag the task for not meeting the standards of the evaluation.
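To make the structure of claim 1 easier to follow, the pipeline as this Office Action characterizes it (limitations A through E) can be sketched in a few lines of Python. This is only an illustrative sketch of that characterization: the function, the stand-in models, the toy data, and the 0.8 threshold are hypothetical and are not taken from the application or the cited references.

    # Illustrative sketch of the claim 1 pipeline as characterized in the Office
    # Action (limitations A-E). All names and values here are hypothetical; the
    # claimed neural networks are reduced to plain callables.
    def annotate_video(video_frames, task, annotators, select_frames,
                       matchability, confidence, notify, threshold=0.8):
        # (A) extract frames for annotation based on the task selected by the user
        frames = select_frames(video_frames, task)

        # (B) assign the task using matchability scores and historical performance
        scores = {a["name"]: matchability(task, a["history"]) for a in annotators}
        assignee = max(scores, key=scores.get)

        # stand-in: each selected frame yields one annotation with a confidence score
        annotations = [{"frame": f, "annotator": assignee, "confidence": confidence(f)}
                       for f in frames]

        # (C)/(D) flag annotations whose confidence score is below the threshold
        flagged = [ann for ann in annotations if ann["confidence"] < threshold]

        # (E) send notifications of the flagged annotations to the user
        notify(flagged)
        return flagged

    # hypothetical usage with toy stand-ins for the trained neural networks
    flags = annotate_video(
        video_frames=list(range(10)),
        task="pose_estimation",
        annotators=[{"name": "a1", "history": 0.9}, {"name": "a2", "history": 0.4}],
        select_frames=lambda frames, task: frames[::2],
        matchability=lambda task, history: history,
        confidence=lambda frame: 0.5 + 0.05 * frame,
        notify=print,
    )

Reducing limitations A through C to bare callables mirrors the point the rejection makes: as characterized here, the claim recites outcomes produced by a neural network rather than how those outcomes are achieved.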
Claim 2 at a high level recites finding the distance between the annotation task and a historical annotation task. The calculation of a distance is a simple mathematical formula that could be performed in the human mind by evaluation. Claim 3 at a high level recites determining a completion rate and a rate of disagreement, which encompasses observing the annotations and providing judgments on both the completion and the disagreement; calculating a rate is simple mathematics that could be performed in the human mind. Claim 5 at a high level recites the use of completion rates, accuracy, and inter-annotator agreement, which encompasses evaluating and observing the annotation results and performing simple mathematics to provide a rate, which could be performed in the human mind. Claim 6 at a high level recites the sending of notifications, similar to limitation E of claim 1. Claim 7 at a high level recites that the tasks consist of object recognition and detection, segmentation, pose estimation, action recognition, and attribute recognition. Each of the recited tasks can be performed by the human mind; for example, a worker can detect and identify an animal [or object] in a video, or a worker can observe a video and evaluate the movement of a person to estimate the pose or behavior of the person in the video. Claims 8-10 at a high level recite training using feedback and ground truths of the task; the human mind can be trained through feedback and ground truth. For example, a worker is trained at a new task based on a previously completed and verified example of the outcome of the task and given feedback from a supervisor. Claims 11-20 similarly contain mental processes of observations, evaluations, judgments, and opinions. (Step 2A, Prong One: YES)
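For concreteness, the "simple mathematics" attributed above to claims 2, 3, and 5 reduces to arithmetic of the following kind; every number below is invented purely for illustration and does not come from the record.

    # Hypothetical numbers illustrating the rate and distance arithmetic the
    # Office Action attributes to claims 2, 3, and 5.
    completed_tasks, assigned_tasks = 18, 20
    completion_rate = completed_tasks / assigned_tasks                    # 0.90

    disagreeing_annotations, compared_annotations = 3, 18
    disagreement_rate = disagreeing_annotations / compared_annotations    # ~0.17

    # claim 2: "distance" between the current task and a historical task, e.g.
    # as a difference between simple numeric task representations
    current_task_feature, historical_task_feature = 0.75, 0.60
    task_distance = abs(current_task_feature - historical_task_feature)   # ~0.15

    print(completion_rate, round(disagreement_rate, 2), task_distance)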
Step 2A, Prong Two: Step 2A, Prong Two of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception, or whether the claim is "directed to" the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application. See MPEP 2106.04(d). "Additional elements" are generally features, limitations, or steps recited in the claim beyond the judicial exceptions. MPEP 2106.05 identifies considerations that indicate whether or not there is integration: MPEP 2106.05(a) improvements, (b) particular machine, (c) particular transformation, and (e) other meaningful limitations generally concern limitations that are indicative of integration, whereas 2106.05(d) well-understood, routine, conventional activity, (f) mere instructions to apply, (g) insignificant extra-solution activity, and (h) field of use generally concern limitations that are not indicative of integration.

Limitations A and E recite "receiving" (extracting) and "outputting" (sending), which amount to mere data gathering and output recited at a high level of generality, and thus are insignificant extra-solution activity. MPEP 2106.05(g) states that insignificant extra-solution activity is generally understood as activities incidental to the primary process or product that are merely a nominal or tangential addition to the claim (examples include transmitting, storing, and outputting information). Limitations A, B, and C are recited as being performed by a neural network. The neural network is recited at a high level of generality and amounts to no more than mere instructions to implement an abstract idea on a generic computer. The neural network is used to generally apply the abstract idea without limiting how the neural network is trained or how it functions; it is described at such a high level that it amounts to using a computer with a generic neural network to apply the abstract idea. These limitations only recite that a neural network is used to produce the outcome, without any details about how the outcomes are accomplished. Claims 2-5 and 7 do not include additional elements beyond the abstract ideas of mental processes in the form of observing and evaluating to complete simple mathematics. Claim 6 at a high level recites sending a notification, which is mere outputting of data and thus insignificant extra-solution activity. Claims 8-9 recite use of a neural network; the neural network is recited at a high level of generality and amounts to no more than mere instructions to implement an abstract idea on a generic computer. These claims only recite that the neural network is trained, without specific detail on how the feedback and ground truth are used to train the neural network. The additional elements that are present do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception. (Step 2A: YES)

Step 2B: Step 2B of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05. As explained in Step 2A, claim 1 contains additional elements that relate to the receiving and outputting of data, and the neural network is merely applied to the judicial exception. The considerations of Step 2A, Prong Two and Step 2B overlap, but differ in that Step 2B requires consideration of the claim as a combination of the limitations; see MPEP 2106.05, subsection I.A. At Step 2B, the evaluation of the insignificant extra-solution activity takes into account whether or not the activity is well understood, routine, and conventional in the field. See MPEP 2106.05(g). The recitation of extracting frames in a video and sending notifications of the flagged annotations in claim 1 is made at a high level of generality and amounts to receiving or transmitting data, which is well-understood, routine, conventional activity. See MPEP 2106.05(d), subsection II. The limitations remain insignificant extra-solution activity even upon reconsideration. Claim 6 is similarly analyzed and found to amount to well-understood, routine, and conventional activity. Claims 2-5 and 7 do not include additional elements. Claims 8-10 include the generic use of a neural network without additional elements.
Even when considered in combination, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer and insignificant extra-solution activity, which do not provide an inventive concept. (Step 2B: NO)

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-5, 7-11, 13-15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 12406480 (hereinafter "Sahni") in view of Sharma US 10977518 (hereinafter "Sharma").

Regarding claim 1, Sahni teaches a method for facilitating human annotation of videos comprising (see col 4 lines 4-7 and 10-17, an annotation system for videos that sends the content [videos] to participating annotators): extracting, using one or more trained neural networks, frames in a video for annotation based on an annotation task selected by a user (see col 7 lines 43-51, the data selector selects content items [videos or images] from the content database for annotating that are associated with a job [a job is created and managed based on a job description provided by an administrator [user], col 4 line 51 through col 5 line 5]; the selection criteria for selecting content may be associated with a prediction metric made by the machine learning engine [col 7 lines 36-38], which may use neural networks [col 6 lines 25-27]); assigning, using the one or more trained neural networks, the annotation task to one or more annotators based on matchability scores (see col 5 lines 55-62, a job description may identify a set of annotators for receiving the content items for the annotation job based on specific criteria including field of expertise and level of experience [matchability score]; see figure 1, the annotation system 100 includes machine learning engine 126, which may use neural networks [col 6 lines 25-27]); in response, flagging annotations (see col 9 lines 35-41, a user [annotator] can flag content, and the content may be flagged for low quality); and sending notifications of the flagged annotations to the user (see col 9 lines 35-37, the content flagged by the annotator is flagged for review by the administrator [user], interpreted as the content being flagged and sent to the user).

Sahni does not teach: assigning the annotation task to one or more annotators based on historical annotation performance of the annotators; determining, using the one or more trained neural networks, whether one or more confidence scores of one or more annotations in annotated frames received from the one or more annotators are below a threshold; or, in response to a determination that the one or more confidence scores are below the threshold, flagging the one or more annotations associated with the one or more confidence scores below the threshold.
Sharma teaches assigning the annotation task to one or more annotators based on historical annotation performance of the annotators (see col 6 line 28, the Annotation Job Controller may select a set of annotators for the job via the annotator information in the annotation job repository; the annotator information may include the quality score [see figure 1 and col 7 lines 21-23], the quality score being the historical annotation performance indicating the level of correctness of the annotations provided by the annotator over time [col 7 lines 45-49]); determining, using the one or more trained neural networks, whether one or more confidence scores of one or more annotations in annotated frames received from the one or more annotators are below a threshold (see col 8 lines 21-28 and line 63 through col 9 line 5, the use of a confidence score [indicating the level of confidence that the annotation is proper, col 7 lines 3-12] compared to a threshold as part of the active learning model [which may comprise a machine learning model such as a convolutional neural network trained with input data, col 8 lines 51-54]); and, in response to a determination that the one or more confidence scores are below the threshold, flagging the one or more annotations associated with the one or more confidence scores below the threshold (see col 8 lines 63-65, when the confidence score is below a threshold the data element may be marked as "bad").

Sahni and Sharma are analogous art because they are from the same field of endeavor of video annotation systems, which include machine learning using neural networks, that select a set of annotators for annotation jobs and mark the content based on its quality. Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to modify Sahni to incorporate assigning tasks using historical annotation performance, determining whether the confidence score of an annotation is below a threshold, and flagging annotations below the threshold, as taught by Sharma. The motivation for doing so would have been to give higher weight to the annotations of the annotators with higher historical performance and to compare the annotations of a task across annotators (Sharma, col 7 lines 13-44).

Regarding claim 3, Sharma and Sahni teach the method of claim 1. Sharma teaches the historical annotation performance of the annotators is determined based on (see col 7 lines 50-60, the quality score [historical annotation performance] may be based on the difference between the annotator's annotation and the consolidated annotation [a combination of multiple annotations from multiple annotators, col 7 lines 6-10]). Sharma does not teach annotating completion rates. Sahni teaches annotating completion rates (see col 8 lines 44-54, the cycle manager tracks the progress of the annotation job and determines when the job is complete using predefined completion criteria, which include obtaining labels for at least a predefined number of content items [completion rate]).

Regarding claim 4, Sharma and Sahni teach the method of claim 1. Sharma teaches the annotators are one or more servers or persons on an annotation platform (see col 4 lines 58-63, annotators are humans or machine learning models that provide annotations of some sort).

Regarding claim 5, Sharma and Sahni teach the method of claim 1.
Sharma teaches that after assigning, the method further comprises monitoring annotation progress, the annotation progress comprising (see col 7 lines 15-40, the use of the annotation consolidation module to combine a collection of annotations and determine how similar or different they are, and the use of annotators with a higher quality score to weight scores and determine what the correct annotation is [accurate annotation]). Sharma does not teach completion rates. Sahni teaches that after assigning, the method further comprises monitoring annotation progress, the annotation progress comprising completion rates and accuracy (see col 8 lines 44-54, the cycle manager tracks the progress of the annotation job and determines when the job is complete using predefined completion criteria, which include obtaining labels for at least a predefined number of content items [completion rate] and at least a predefined average prediction metric for the predictions made [accuracy]).

Regarding claim 7, Sharma and Sahni teach the method of claim 1. Sharma teaches the annotation task is selected from object recognition and detection, segmentation, pose estimation, action recognition, attribute recognition of people, or a combination thereof (see col 5 lines 38-45, the user may select between annotation tasks including image classification and object detection, with additional embodiments allowing for different annotation tasks).

Regarding claim 8, Sahni and Sharma teach the method of claim 1. Sahni teaches the one or more trained neural networks are trained based on historical manual selections of frames in historical videos associated with historical annotation tasks and feedback of historical frame selections from the user (see col 4 lines 45-51, training the machine learning model [which may include neural networks], which includes identifying training content items [interpreted as identified frames selected] and the associated annotations [interpreted as feedback]).

Regarding claim 9, Sahni and Sharma teach the method of claim 1. Sharma teaches the one or more trained neural networks are trained based on historical annotator selections by the user in association with the annotation task and feedback of the generated assignment (see col 7, the weighted selection of annotators based on the annotator quality score and the updating [training] of quality scores based on the contribution to the consolidated annotations [interpreted as feedback of the annotation job assignment]).

Regarding claim 10, Sharma and Sahni teach the method of claim 1. Sharma teaches the one or more trained neural networks are trained based on feedback of the flagged annotations from the user and manual flags marked by the user (see col 8 lines 46-60, the machine learning model, such as a convolutional neural network, is trained using input data with consolidated annotations, as well as metadata including the confidence score of annotations [used to flag annotations], individual annotations, and quality scores of annotators; the job submitters [user] may select representations of "bad" examples to aid as part of the instructions [col 2 lines 43-50], interpreted as manually marked by the user).

Claims 11, 13-15, and 17-20 are analogous system claims to method claims 1, 3-5, and 7-10, respectively, and thus are analyzed and rejected similarly to claims 1, 3-5, and 7-10.

Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Sahni in view of Sharma, further in view of Jannink US 20210362363 (hereinafter "Jannink"). Regarding claim 2, Sahni and Sharma teach the method of claim 1.
Sahni teaches the matchability scores of the annotators are determined based on (see col 5 lines 55-67, the use of the job description including specific criteria of the annotator [annotator task] and the annotator having specific criteria including field of expertise [annotation task performed by the annotator] to determine which annotators or group is assigned the job; Sahni does not explicitly teach the distance between the annotation task and the historical annotation task, for which an additional reference is relied upon). Neither Sahni nor Sharma explicitly teaches the distances between the annotation task and historical annotation tasks performed by the annotators. Jannink teaches the distances between the annotation task and historical annotation tasks performed by the annotators (see paragraph 0070, the classifier may compute a similarity between a representation of a user request [annotation task] and representations of previously received user requests [historical annotation tasks]; the user request comprises a task to be performed [paragraph 0040]).

Sahni, Sharma, and Jannink are analogous art because they are from the same field of endeavor of using machine learning models with annotation tasks based on user inputs. Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to modify Sahni and Sharma to incorporate the distance between the annotation task and the historical annotation tasks performed, as taught by Jannink. The motivation for doing so would have been to make selections based on the smallest distance when comparing possible tasks (Jannink, paragraph 0070). Claim 12 is an analogous system claim to method claim 2, and thus claim 12 is analyzed and rejected similarly to claim 2.

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Sahni in view of Sharma, further in view of Williams US 12321385 (hereinafter "Williams"). Regarding claim 6, Sharma and Sahni teach the method of claim 1. Neither Sharma nor Sahni teaches that the method further comprises sending the notifications of the flagged annotations to the one or more annotators. Williams teaches that the method further comprises sending the notifications of the flagged annotations to the one or more annotators (see col 4 lines 60-67, if the verification of successful labelling fails, the verification task may be re-issued to a different annotator).

Sahni, Sharma, and Williams are analogous art because they are from the same field of endeavor of automating video annotation or labelling by sending videos to annotators for consideration. Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to modify Sahni and Sharma to incorporate sending a failed verification [flagged annotation] to a different annotator, as taught by Williams. The motivation for doing so would have been to verify the labelling and determine if the video needs to be flagged for alternative means of labeling (Williams, col 5 lines 1-5). Claim 16 is an analogous system claim to method claim 6, and thus claim 16 is analyzed and rejected similarly to claim 6.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see the attached PTO-892 Notice of References Cited.

Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY R. HAUK, whose telephone number is (571) 272-5966. The examiner can normally be reached M-F 8:00-5:00.
Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chan Park, can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/EMILY HAUK/
Examiner, Art Unit 2669

/CHAN S PARK/
Supervisory Patent Examiner, Art Unit 2669

Prosecution Timeline

Oct 31, 2023: Application Filed
Feb 19, 2026: Non-Final Rejection — §101, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 100%
With Interview: 99% (+100.0%)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 2 resolved cases by this examiner. Grant probability derived from career allow rate.
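In other words, with 2 granted out of 2 resolved cases the career allow rate is 2 / 2 = 100%, and the grant probability shown above simply mirrors that figure.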
