DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The IDS dated 4/17/2025 has been considered and placed in the application file.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 9-10, 12-16, and 18-20 are rejected under 35 U.S.C. 103 as obvious over Brdiczka et al. (US 20240127511 A1, hereafter referred to as Brdiczka) in view of Wu et al. (US 20210224315 A1, hereafter referred to as Wu).
Claim 1
Regarding Claim 1, Brdiczka teaches A data processing system comprising:
a processor (Brdiczka in ¶100 discloses a processor); and
a memory storing executable instructions that, when executed, cause the processor alone or in combination with other processors (Brdiczka in ¶100 discloses memory) to perform operations of:
receiving an electronic copy of an image from an application (Brdiczka in ¶78-80 discloses receiving inputs including a source image which is a previously generated image);
receiving a natural language prompt input by a user of the application requesting that the application generate a digital picture frame for the image (Brdiczka in ¶47-49 discloses natural language prompt with frame and style),
the natural language prompt including a description of the frame to be created for the image (Brdiczka in ¶47-49 discloses the prompt describes a frame's size, configuration, style, etc.);
analyzing the natural language prompt using a key-phrase extraction unit to extract one or more key phrases from the natural language prompt that describe a topic of the frame to be generated for the image (Brdiczka in ¶49-51 discloses a control language identifier to analyze a topic of the frame to be generated for the image);
analyzing the set of candidate frame images using an image placement unit to obtain a set of framed images based on the image and the candidate frame images (Brdiczka in ¶40-42, 62 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques); and
presenting the set of framed images on a user interface of the application (Brdiczka in FIG. 6, ¶64 discloses displaying composed images).
Brdiczka does not explicitly teach all of providing the one or more key phrases as an input to a retrieval engine; analyzing the one or more key phrases with the retrieval engine to identify a set of candidate frame images from among a plurality of frame images in a labeled frame images datastore.
However, Wu teaches providing the one or more key phrases as an input to a retrieval engine (Wu in Abstract, ¶31 discloses a search request includes an NL statement where phrases are analyzed);
analyzing the one or more key phrases with the retrieval engine to identify a set of candidate frame images from among a plurality of frame images in a labeled frame images datastore (Wu in Abstract, ¶31 discloses analyzing phrases to perform a relevance determination for image results).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Brdiczka by analyzing key phrases to identify candidate frame images from a labeled datastore as taught by Wu, since both references are analogous art in the field of natural language processing for image retrieval and composition; thus, one of ordinary skill in the art would be motivated to combine the references since combining Brdiczka's generative framing and source image composition with Wu's key-phrase retrieval from a labeled image datastore yields the predictable result of faster, more efficient frame selection for common or pre-curated styles while maintaining flexibility for novel descriptions.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claim 2
Regarding Claim 2, Brdiczka in view of Wu teaches The data processing system of claim 1, wherein analyzing the set of candidate frame images using the image placement unit to obtain the set of framed images further comprises:
analyzing the image using an object detection model to detect one or more objects in the image and output object information for the one or more objects (Brdiczka in ¶25, 33-35 discloses detecting and outputting information related to objects in an image); and
placing the image in each candidate frame image of the candidate frame images using an image cropping model, the image cropping model being trained to analyze the object information and the candidate frame image to crop the image to fit in the candidate frame image and output a candidate framed image (Brdiczka in ¶40-42, 62-63 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques).
Claim 3
Regarding Claim 3, Brdiczka in view of Wu teaches The data processing system of claim 2, wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of:
analyzing each framed image of the set of framed images using an image harmonization model trained to adjust attributes of the candidate frame image to harmonize an appearance of the image and an appearance of the candidate frame image included in the framed image (Brdiczka in ¶40-42, 62-63 discloses seamless layering for visual coherence).
Claim 4
Regarding Claim 4, Brdiczka in view of Wu teaches The data processing system of claim 1,
wherein the retrieval engine determines a similarity score for each candidate frame image of the set of candidate frame images (Wu in Abstract, ¶25, 31-37 discloses re-ranking via similarity), and
wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform an operation of ranking the set of candidate frame images based on the similarity score (Wu in Abstract, ¶25, 31-37 discloses re-ranking via similarity).
Claim 5
Regarding Claim 5, Brdiczka in view of Wu teaches The data processing system of claim 4, wherein the retrieval engine determines the similarity score by performing operations of:
mapping the one or more key phrases to a multidimensional vector space using a transformer model to generate an encoded representation of the one or more key phrases (Brdiczka in ¶69 discloses transformers encoding key phrases);
comparing the encoded representation of the one or more key phrases with encoded representations of labels associated with each of the plurality of frame images to determine the similarity score for each candidate frame image of the set of candidate frame images (Wu in Abstract, ¶25, 31-37 discloses extracting semantic concepts from the query, labeling images with metadata/tags for matching, and similarity/confidence scoring for re-ranking. Transformer embeddings, as taught by Brdiczka, are a predictable, standard technique for better semantic matching).
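For illustration only, the similarity computation recited in claim 5 can be sketched as follows; the toy `embed` function is a hypothetical stand-in for the transformer encoder discussed at Brdiczka ¶69, and none of this code is drawn from either reference:

```python
import math

def embed(phrases, dims=8):
    # Hypothetical stand-in for a transformer encoder: buckets each token
    # into a fixed-size vector, then L2-normalizes. A real system would map
    # the phrases to a learned multidimensional vector space instead.
    vec = [0.0] * dims
    for phrase in phrases:
        for token in phrase.lower().split():
            vec[sum(ord(c) for c in token) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def similarity_score(key_phrases, frame_labels):
    # Cosine similarity between the encoded key phrases and the encoded
    # labels of one candidate frame image (both vectors are unit-length).
    return sum(a * b for a, b in zip(embed(key_phrases), embed(frame_labels)))

def rank_candidates(key_phrases, labeled_frames):
    # labeled_frames: {frame_id: [label, ...]}; highest similarity first,
    # matching the ranking step recited in claim 4.
    return sorted(labeled_frames,
                  key=lambda fid: similarity_score(key_phrases, labeled_frames[fid]),
                  reverse=True)
```

A frame whose labels match the extracted key phrases scores 1.0 and ranks first; partially overlapping labels score between 0 and 1.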
Claim 9
Regarding Claim 9, Brdiczka in view of Wu teaches The data processing system of claim 1, wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of:
determining that the retrieval engine did not identify any candidate frame images from among the plurality of frame images (Brdiczka in ¶20 discloses the common problem that retrieval may yield no or insufficient results; an obvious conditional fallback of this kind is routine error handling. Wu in ¶35-44 provides further support.); and
generating one or more new frame images using a text-to-image model (Brdiczka in Abstract, ¶47-49 discloses generating frame images using a text-to-image model); and
including the one or more new frame images in the set of candidate frame images (Brdiczka in Abstract, ¶47-49 discloses generating frame images using a text-to-image model).
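The conditional fallback recited in claim 9 amounts to the following control flow; this is a sketch only, and `retrieve` and `generate` are hypothetical placeholders for the retrieval engine and text-to-image model, not functions from either reference:

```python
def get_candidate_frames(key_phrases, retrieve, generate):
    # Try the retrieval engine first; when it identifies no candidate
    # frame images, fall back to generating new ones from the key
    # phrases and include them in the candidate set.
    candidates = retrieve(key_phrases)
    if not candidates:
        candidates = list(generate(" ".join(key_phrases)))
    return candidates
```

When retrieval succeeds, the generator is never invoked; when it returns an empty set, the generated images become the candidate set.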
Claim 10
Regarding Claim 10, Brdiczka in view of Wu teaches The data processing system of claim 9, wherein generating the one or more frame images using the text-to-image model further comprises:
selecting a prompt from among a plurality of prompts of a pre-generated prompt datastore based on the one or more key phrases (Brdiczka in ¶20 discloses “target scene generation system … preserves the composition of a target scene by decomposing the natural language description of the target scene. Decomposing the natural language description results in an identification of control language and an identification of descriptive scene language, which allows groups of sub-prompts to be derived.”); and
providing the prompt to the text-to-image generative language model to cause the text-to-image generative language model to generate the one or more new frame images (Brdiczka in ¶20 discloses “target scene generation system … preserves the composition of a target scene by decomposing the natural language description of the target scene. Decomposing the natural language description results in an identification of control language and an identification of descriptive scene language, which allows groups of sub-prompts to be derived.”).
Claim 12
Regarding Claim 12, Brdiczka in view of Wu teaches The data processing system of claim 10, wherein the text-to-image generative language model is a large language model (LLM) (Brdiczka in Abstract, ¶47-49 discloses text-to-image models which use LLM components).
Claim 13
Regarding Claim 13, Brdiczka teaches A data processing system comprising:
a processor (Brdiczka in ¶100 discloses a processor); and
a memory storing executable instructions that, when executed, cause the processor alone or in combination with other processors (Brdiczka in ¶100 discloses memory) to perform operations of:
receiving an electronic copy of an image from an application (Brdiczka in ¶78-80 discloses receiving inputs including a source image which is a previously generated image);
receiving a natural language prompt input by a user of the application requesting that the application generate a digital picture frame for the image (Brdiczka in ¶47-49 discloses natural language prompt with frame and style),
the natural language prompt including a description of the frame to be created for the image (Brdiczka in ¶47-49 discloses the prompt describes a frame's size, configuration, style, etc.);
analyzing the natural language prompt using a key-phrase extraction unit to extract one or more key phrases from the natural language prompt that describe a topic of the frame to be generated for the image (Brdiczka in ¶49-51 discloses a control language identifier to analyze a topic of the frame to be generated for the image);
selecting a prompt from among a plurality of prompts of a pre-generated prompt datastore based on the one or more key phrases (Brdiczka in ¶20, 27 discloses sub-prompt may be obtained from databases/data stores for the generative AI module to generate visual elements);
providing the prompt to a text-to-image generative language model to cause the text-to-image generative language model to generate a set of candidate frame images (Brdiczka in Abstract, ¶49-51 discloses receiving a prompt and a control language identifier to analyze a topic of the frame to be generated for the image);
analyzing the set of candidate frame images using an image placement unit to obtain a set of framed images based on the image and the candidate frame images (Brdiczka in ¶40-42, 62 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques); and
presenting the set of framed images on a user interface of the application (Brdiczka in FIG. 6, ¶64 discloses displaying composed images).
Brdiczka does not explicitly teach all of providing the prompt to a text-to-image generative language model to cause the text-to-image generative language model to generate a set of candidate frame images.
However, Wu teaches providing the prompt to a text-to-image generative language model to cause the text-to-image generative language model to generate a set of candidate frame images (Wu in Abstract, ¶31 discloses a search engine modified to perform increasingly precise image searching using iterative Natural Language (NL) interactions).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Brdiczka by analyzing key phrases to identify candidate frame images from a labeled datastore as taught by Wu, since both references are analogous art in the field of natural language processing for image retrieval and composition; thus, one of ordinary skill in the art would be motivated to combine the references since combining Brdiczka's generative framing and source image composition with Wu's key-phrase retrieval from a labeled image datastore yields the predictable result of faster, more efficient frame selection for common or pre-curated styles while maintaining flexibility for novel descriptions.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claim 14
Regarding Claim 14, Brdiczka in view of Wu teaches The data processing system of claim 13, wherein analyzing the set of candidate frame images using the image placement unit to obtain the set of framed images further comprises:
analyzing the image using an object detection model to detect one or more objects in the image and output object information for the one or more objects (Brdiczka in ¶25, 33-35 discloses detecting and outputting information related to objects in an image); and
placing the image in each candidate frame image of the candidate frame images using an image cropping model, the image cropping model being trained to analyze the object information and the candidate frame image to crop the image to fit in the candidate frame image and output a candidate framed image (Brdiczka in ¶40-42, 62-63 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques).
Claim 15
Regarding Claim 15, Brdiczka in view of Wu teaches The data processing system of claim 14,
wherein the object information comprises a bounding box surrounding the one or more objects in the image (Brdiczka in ¶25 discloses an input extractor that identifies the arrangement/composition of objects in an image; bounding boxes are a routine output of detection models.), and
wherein the image cropping model uses the bounding box to determine whether to crop the image, the candidate frame, or both (Brdiczka in ¶62-63 discloses cutout/layering that fits the frame to the subject.).
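Claim 15's use of a bounding box to drive cropping could look like the following; this is a hypothetical geometry sketch for illustration, not an implementation taken from Brdiczka:

```python
def crop_plan(image_size, subject_box, aperture):
    # image_size: (w, h); subject_box: (x, y, w, h) from the object
    # detector; aperture: (w, h) opening of the candidate frame.
    # Returns an (x, y, w, h) crop of the image with the aperture's
    # aspect ratio, centered on the detected subject.
    iw, ih = image_size
    bx, by, bw, bh = subject_box
    aw, ah = aperture
    ratio = aw / ah
    # Smallest rectangle with the aperture's aspect ratio that still
    # contains the subject's bounding box.
    w = max(bw, bh * ratio)
    h = w / ratio
    # If the image itself is smaller, shrink while keeping the ratio;
    # past this point the frame, not the image, would need adjusting.
    scale = min(iw / w, ih / h, 1.0)
    w, h = w * scale, h * scale
    # Center the crop on the subject, clamped to the image bounds.
    cx, cy = bx + bw / 2.0, by + bh / 2.0
    x = min(max(cx - w / 2.0, 0.0), iw - w)
    y = min(max(cy - h / 2.0, 0.0), ih - h)
    return x, y, w, h
```

The crop never cuts into the bounding box unless the whole image is smaller than the required rectangle, which is the case where the frame rather than the image would be cropped.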
Claim 16
Regarding Claim 16, Brdiczka in view of Wu teaches The data processing system of claim 14, wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of:
analyzing each framed image of the set of framed images using an image harmonization model trained to adjust attributes of the candidate frame image to harmonize an appearance of the image and an appearance of the candidate frame image included in the framed image (Brdiczka in ¶40-42, 62-63 discloses seamless layering for visual coherence).
Claim 18
Regarding Claim 18, Brdiczka in view of Wu teaches The data processing system of claim 13, wherein the key-phrase extraction unit is implemented using a large language model (LLM), and wherein the text-to-image generative language model is a large language model (LLM) (Brdiczka in Abstract, ¶47-49 discloses NLP parsing and generative AI conditioned on text which use LLM components.).
Claim 19
Regarding Claim 19, Brdiczka teaches A method implemented in a data processing system for a contextually relevant digital picture frame for an image, the method comprising:
receiving an electronic copy of an image from an application (Brdiczka in ¶78-80 discloses receiving inputs including a source image which is a previously generated image);
receiving a natural language prompt input by a user of the application requesting that the application generate a digital picture frame for the image (Brdiczka in ¶47-49 discloses natural language prompt with frame and style),
the natural language prompt including a description of the frame to be created for the image (Brdiczka in ¶47-49 discloses the prompt describes a frame's size, configuration, style, etc.);
analyzing the natural language prompt using a key-phrase extraction unit to extract one or more key phrases from the natural language prompt that describe a topic of the frame to be generated for the image (Brdiczka in ¶49-51 discloses a control language identifier to analyze a topic of the frame to be generated for the image);
analyzing the set of candidate frame images using an image placement unit to obtain a set of framed images based on the image and the candidate frame images (Brdiczka in ¶40-42, 62 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques); and
presenting the set of framed images on a user interface of the application (Brdiczka in FIG. 6, ¶64 discloses displaying composed images).
Brdiczka does not explicitly teach all of providing the one or more key phrases as an input to a retrieval engine; analyzing the one or more key phrases with the retrieval engine to identify a set of candidate frame images from among a plurality of frame images in a labeled frame images datastore.
However, Wu teaches providing the one or more key phrases as an input to a retrieval engine (Wu in Abstract, ¶31 discloses a search request includes an NL statement where phrases are analyzed);
analyzing the one or more key phrases with the retrieval engine to identify a set of candidate frame images from among a plurality of frame images in a labeled frame images datastore (Wu in Abstract, ¶31 discloses analyzing phrases to perform a relevance determination for image results).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Brdiczka by analyzing key phrases to identify candidate frame images from a labeled datastore as taught by Wu, since both references are analogous art in the field of natural language processing for image retrieval and composition; thus, one of ordinary skill in the art would be motivated to combine the references since combining Brdiczka's generative framing and source image composition with Wu's key-phrase retrieval from a labeled image datastore yields the predictable result of faster, more efficient frame selection for common or pre-curated styles while maintaining flexibility for novel descriptions.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claim 20
Regarding Claim 20, Brdiczka in view of Wu teaches The method of claim 19, wherein analyzing the set of candidate frame images using the image placement unit to obtain the set of framed images further comprises:
analyzing the image using an object detection model to detect one or more objects in the image and output object information for the one or more objects (Brdiczka in ¶25, 33-35 discloses detecting and outputting information related to objects in an image); and
placing the image in each candidate frame image of the candidate frame images using an image cropping model, the image cropping model being trained to analyze the object information and the candidate frame image to crop the image to fit in the candidate frame image and output a candidate framed image (Brdiczka in ¶40-42, 62-63 discloses applying generated frame to source image while applying conventional auto-fit/crop techniques).
Claims 11 and 17 are rejected under 35 U.S.C. 103 as obvious over Brdiczka et al. (US 20240127511 A1, hereafter referred to as Brdiczka) in view of Wu et al. (US 20210224315 A1, hereafter referred to as Wu), further in view of Boyd et al. (US 20250131623 A1, hereafter referred to as Boyd).
Claim 11
Regarding Claim 11, Brdiczka in view of Wu teaches The data processing system of claim 10.
Brdiczka in view of Wu does not explicitly teach all of wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: analyzing each frame image of the one or more new frame images using a moderation service; and discarding a respective frame image of the one or more new frame images responsive to the moderation service determining that the respective frame image includes potentially offensive content.
However, Boyd teaches wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of:
analyzing each frame image of the one or more new frame images using a moderation service (Boyd in ¶190 discloses “processing the automatically modified image by a safety check model”); and
discarding a respective frame image of the one or more new frame images responsive to the moderation service determining that the respective frame image includes potentially offensive content (Boyd in ¶190 discloses “The content item that was automatically modified is only made available if the modified content item passes the safety check model.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Brdiczka in view of Wu by incorporating a safety check model to analyze images and discard unwanted content as taught by Boyd, since all references are analogous art in the field of generative AI image processing; thus, one of ordinary skill in the art would be motivated to combine the references since combining the generative framing of Brdiczka in view of Wu with Boyd's safety check model yields the predictable result of ensuring that only non-offensive content is presented.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claim 17
Regarding Claim 17, Brdiczka in view of Wu teaches The data processing system of claim 13.
Brdiczka in view of Wu does not explicitly teach all of wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: analyzing each candidate frame image of the set of candidate frame images using a moderation service; and discarding each candidate frame image of the set of candidate frame images responsive to the moderation service determining that the candidate frame image includes potentially offensive content.
However, Boyd teaches wherein the memory further includes instructions configured to cause the processor alone or in combination with other processors to perform operations of:
analyzing each candidate frame image of the set of candidate frame images using a moderation service (Boyd in ¶190 discloses “processing the automatically modified image by a safety check model”); and
discarding each candidate frame image of the set of candidate frame images responsive to the moderation service determining that the candidate frame image includes potentially offensive content (Boyd in ¶190 discloses “The content item that was automatically modified is only made available if the modified content item passes the safety check model.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Brdiczka in view of Wu by incorporating a safety check model to analyze images and discard unwanted content as taught by Boyd, since all references are analogous art in the field of generative AI image processing; thus, one of ordinary skill in the art would be motivated to combine the references since combining the generative framing of Brdiczka in view of Wu with Boyd's safety check model yields the predictable result of ensuring that only non-offensive content is presented.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Allowable Subject Matter
Claims 6-8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUSTIN P CASCAIS whose telephone number is (703) 756-5576. The examiner can normally be reached Monday-Friday 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mr. O'Neal Mistry can be reached on (313) 446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.P.C./Examiner, Art Unit 2674
/ONEAL R MISTRY/Supervisory Patent Examiner, Art Unit 2674
Date: 1/30/2025