Prosecution Insights
Last updated: May 29, 2026
Application No. 18/157,100

INFORMATION PROCESSING APPARATUS, LEARNING APPARATUS, IMAGE RECOGNITION APPARATUS, INFORMATION PROCESSING METHOD, LEARNING METHOD, IMAGE RECOGNITION METHOD, AND NON-TRANSITORY-COMPUTER-READABLE STORAGE MEDIUM

Non-Final OA §103
Filed: Jan 20, 2023
Priority: Jan 27, 2022, JP 2022-011140
Examiner: VANCHY JR, MICHAEL J
Art Unit: 2666
Tech Center: 2600 (Communications)
Assignee: Canon Kabushiki Kaisha
OA Round: 1 (Non-Final)
Grant Probability: 67% (Favorable)
OA Rounds: 1-2
Est. Remaining: 0m
With Interview: 87%

Examiner Intelligence

Grants 67% (above average)
Career Allowance Rate: 67% (406 granted / 608 resolved; +4.8% vs TC avg)
Interview Lift: +20.1% (strong; resolved cases with interview)
Typical timeline: 3y 3m Avg Prosecution; 10 currently pending
Career history: 625 Total Applications across all art units
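The headline figures above appear to be mutually consistent, which a quick calculation can confirm. The sketch below recomputes the career allowance rate from the granted and resolved counts. It then assumes that the "With Interview" figure is the base rate plus the stated interview lift; that derivation is an assumption, since the page does not say how the figure is produced.

```python
# Counts and percentages are taken from the figures reported on this page.
granted, resolved = 406, 608
interview_lift = 20.1  # percentage points

# Career allowance rate as a percentage of resolved cases.
allowance_rate = 100 * granted / resolved

# Assumption: "With Interview" = base allowance rate + interview lift.
with_interview = allowance_rate + interview_lift

print(round(allowance_rate), round(with_interview))  # prints "67 87"
```

Under that assumption the rounded values match the 67% and 87% shown in the summary.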

Statute-Specific Performance

§101: 2.1% (-37.9% vs TC avg)
§103: 92.9% (+52.9% vs TC avg)
§102: 2.6% (-37.4% vs TC avg)
§112: 1.2% (-38.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 608 resolved cases.

Office Action

§103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation 112(f)
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
As to claims 1-15 and 21-25, the “generation units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
As to claims 10 and 11, the “acquisition units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
As to claim 11, the “identification units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
As to claims 12-15 and 22-25, the “learning units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
As to claims 12-15, 17-20, and 22-25, the “detection units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
As to claims 15 and 25, the “formation units” are considered to read on a computer that executes the computer program corresponding to each of the functional units, thereby performing the function of the functional unit; wherein the functional units may be implemented by hardware (Fig. 2; [0052]).
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-10 and 12-25 are rejected under 35 U.S.C. 103 as being unpatentable over Marino et al., US 2017/0278289 A1 (Marino), and further in view of Quinton et al., US 2022/0327811 A1 (Quinton).
Regarding claim 1, Marino teaches an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), comprising: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate a synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate learning data (machine learning system for generating training set) ([0077]), the learning data including a label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating an object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data.
Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]).
Regarding claim 2, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) acquires an image having a texture as the second image (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), and the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which the second image is synthesized in the closed region in the first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]).
Regarding claim 3, Marino teaches wherein the first generation unit (content integration system 100) (Fig. 1; [0116]) generates a closed region using a geometric figure (generating a closed region using a bounding box; either a 2D or 3D bounding box) (Figs. 3C, 3O, and 3S; [0117] and [0119-0120]), sets the generated closed region on the first image (generating the closed region on the target digital content) ([0116]), and generates a synthesized image in which the second image is synthesized in the closed region (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116-0117] and [0119-0120]).
Regarding claim 4, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which the second image is synthesized in a two-dimensional projection region (wherein the source digital content (second image) can be placed/synthesized into the target digital content (first image); wherein the first image area (host region) is a 2D surface such as a wall) ([0343]) in which a virtual object having a three-dimensional shape is projected on the first image (wherein the synthesized image can include the source digital content (second image) as a 3D shape/object projected on the host region of the target digital content (first image)) ([0343-0344]).
Regarding claim 5, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which the second image is synthesized in a closed region set in the first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]) in response to an operation by a user (wherein a user can select the area as the host region within the target digital content (first image)) (Fig. 17B; [0068-0069] and [0279-0280]).
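To make the limitations mapped in claims 1 and 3 concrete, the following is a minimal sketch of synthesizing a second image into a closed region of a first image and recording a label for that region. It is illustrative only and is not taken from the application, Marino, or Quinton. The function names, the use of a rectangle as the geometric figure, the representation of images as nested lists, and the bounding-box label format are all assumptions.

```python
def rectangle_region(top, left, height, width):
    """Closed region built from a geometric figure (assumed: an axis-aligned rectangle)."""
    return {(r, c) for r in range(top, top + height)
            for c in range(left, left + width)}


def synthesize(first_image, second_image, region):
    """Copy second_image pixels into a copy of first_image inside the region."""
    out = [row[:] for row in first_image]
    for r, c in region:
        out[r][c] = second_image[r][c]
    return out


def bounding_box(region):
    """Label as (top, left, bottom, right), with bottom and right exclusive."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


# Hypothetical 8x8 grayscale images: a black first image and a white second image.
first = [[0] * 8 for _ in range(8)]
second = [[255] * 8 for _ in range(8)]

region = rectangle_region(top=2, left=3, height=3, width=4)
synthesized = synthesize(first, second, region)

# Learning data pairs the synthesized image with a label for the closed region.
learning_sample = {"image": synthesized, "label": bounding_box(region)}
print(learning_sample["label"])  # (2, 3, 5, 7)
```

In this sketch the learning sample carries the synthesized image itself, which is the point on which the Office action relies on Quinton rather than Marino.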
Regarding claim 6, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which the second image is synthesized in a closed region (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3N-3Q; [0116] and [0119]) surrounding a contour of an object in the first image (wherein the bounding box surrounds the contour of the detected marker) (Figs. 3N-3Q; [0119]). Regarding claim 7, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which the second image is synthesized in each closed region (wherein one or multiple host regions (with bounding boxes) can be identified) ([0066-0067]) in the first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]). Regarding claim 8, Marino teaches wherein the first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) generates a synthesized image in which a plurality of the second images are synthesized in the closed region in the first image (wherein the content integration module 120 can overlay or place a plurality of source digital content (second images) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3M; [0116-0118]). Regarding claim 9, Marino teaches wherein the second generation unit (host region identification module 110) (Fig. 
1; [0076-0077]) generates learning data (machine learning system for generating training set) ([0077]), the learning data includes the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), and the texture label indicates a region having a texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 10, Marino teaches comprising an acquisition unit configured to acquire the second image (wherein the content integration system 100 can receive the source digital content (second image)) (Fig. 
1; [0116]), the second image being formed by cutting out a portion including a texture pattern (cropping the digital content including the texture in the content) ([0077] and [0318]) in a shape same as a shape of the closed region (wherein the source digital content can match the shape of the associated host region) ([0436-0437]) from a third image including the texture pattern (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088], [0114-0115], and [0344-0348]). Marino also teaches wherein this gives the eventual placement of the source digital content a more immersed and realistic feel, improving viewer experience ([0343]). Regarding claim 12, Marino teaches a learning apparatus (apparatus with a neural network machine learning model) ([0012]), comprising a learning unit (machine learning model) ([0066-0067]) configured to perform learning of a detection unit that detects an object region from an input image included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 
1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). 
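The cut-out step discussed for claim 10 above, in which the second image is cut from a third image in the same shape as the closed region, can be sketched as follows. This is an illustrative assumption and not the method of the application or of Marino. The checkerboard texture, the disc-shaped region, and the dictionary representation of the cut-out are hypothetical choices.

```python
def checkerboard(height, width, cell=2):
    """Hypothetical third image containing a simple texture pattern."""
    return [[255 if ((r // cell) + (c // cell)) % 2 == 0 else 0
             for c in range(width)] for r in range(height)]


def disc_region(center_row, center_col, radius):
    """Closed region shaped as a filled disc of pixel coordinates."""
    return {(r, c)
            for r in range(center_row - radius, center_row + radius + 1)
            for c in range(center_col - radius, center_col + radius + 1)
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2}


def cut_out(third_image, region):
    """Return a {pixel: value} patch whose shape equals the region's shape."""
    return {(r, c): third_image[r][c] for (r, c) in region}


third = checkerboard(8, 8)
region = disc_region(4, 4, 2)
second = cut_out(third, region)

# The cut-out covers exactly the pixels of the closed region.
print(len(second), set(second) == region)  # 13 True
```

Representing the patch by pixel coordinates keeps its shape identical to the closed region by construction, so it can be pasted into the first image without further masking.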
Regarding claim 13, Marino teaches an image recognition apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), comprising a detection unit configured to detect an object region (detecting a host region within the image) ([0066-0067] and [0075-0077]) from an input image using a detection unit learned by a learning apparatus (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) that includes learning unit (machine learning model) ([0066-0067]), the learning unit (machine learning model) ([0066-0067]) performing learning of the detection unit that detects the object region from the input image included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 
1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 14, Marino teaches a learning apparatus (apparatus with a neural network machine learning model) ([0012]), comprising a learning unit (machine learning model) ([0066-0067]) configured to perform learning of a first detection unit and a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting an object region from an input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 
1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. 
Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 15, Marino teaches an image recognition apparatus (image recognition apparatus) ([0018]), comprising a formation unit configured to form a new object region (forming a host region by forming a bounding box) (Figs. 3N and 3O; [0119]) using an object region detected from an input image (detecting a host region within the image) ([0066-0067] and [0075-0077]) using a first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) learned (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]) by a learning apparatus (apparatus with a neural network machine learning model) ([0012]) and a texture region detected from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) using a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) learned by the learning apparatus (apparatus with a neural network machine learning model) ([0012]), the learning apparatus (apparatus with a neural network machine learning model) ([0012]) including a learning unit (machine learning model) ([0066-0067]) configured to perform learning of the first detection unit and the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting the object region from the input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 
3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 16, Marino teaches an information processing method performed by an information processing apparatus (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), the method comprising: generating a synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 
1 and 3A-3U; [0116]); and generating learning data (machine learning system for generating training set) ([0077]) including a label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating an object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). 
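For illustration only, the two steps recited in claim 16 (generating a synthesized image, then generating learning data that includes a label) can be sketched as follows. This is a minimal sketch under assumed conventions, not the applicant's, Marino's, or Quinton's implementation: images are plain 2D lists of pixel values, the closed region is an axis-aligned rectangle, and the names `synthesize`, `make_learning_sample`, and the `(x, y, width, height)` label format are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # assumed: grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # assumed label format: (x, y, width, height)


def synthesize(first: Image, second: Image, x: int, y: int) -> Image:
    """Paste `second` into a rectangular closed region of `first` at (x, y)."""
    h, w = len(second), len(second[0])
    if y < 0 or x < 0 or y + h > len(first) or x + w > len(first[0]):
        raise ValueError("closed region must lie inside the first image")
    out = [row[:] for row in first]  # copy so the first image is not modified
    for dy in range(h):
        out[y + dy][x:x + w] = second[dy]
    return out


def make_learning_sample(first: Image, second: Image, x: int, y: int) -> Dict[str, object]:
    """Pair the synthesized image with a label covering the closed region."""
    image = synthesize(first, second, x, y)
    label: Box = (x, y, len(second[0]), len(second))
    return {"image": image, "label": label}


background = [[0] * 6 for _ in range(4)]  # hypothetical 6x4 first image
patch = [[9, 9], [9, 9]]                  # hypothetical 2x2 second image
sample = make_learning_sample(background, patch, x=3, y=1)
```

In this sketch the label is derived directly from the paste position, so the synthesized image and its label are produced together without manual annotation. That is the kind of pairing the rejection relies on Quinton for, since Marino is acknowledged not to place the synthesized image in the learning data.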
Regarding claim 17, Marino teaches a learning method performed by a learning apparatus (apparatus with a neural network machine learning model) ([0012]), comprising performing learning (machine learning model) ([0066-0067]) of a detection unit that detects an object region from an input image included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated in an information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), wherein the information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) includes: generating the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and generating the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. 
Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 18, Marino teaches an image recognition method performed by an image recognition apparatus (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), comprising detecting an object region (detecting a host region within the image) ([0066-0067] and [0075-0077]) from an input image using a detection unit learned by a learning method (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated in an information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 
3C, 3N, 3O, and 3S; [0077]), the learning method performing learning of the detection unit that detects the object region from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) includes: generating the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and generating the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 19, Marino teaches a learning method performed by a learning apparatus (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), comprising performing learning of a first detection unit and a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated in an information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting an object region from an input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) includes: generating the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and generating the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the generating generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). 
However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 20, Marino teaches an image recognition method performed by an image recognition apparatus (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), comprising forming a new object region (forming a host region by forming a bounding box) (Figs. 3N and 3O; [0119]) using an object region detected from an input image (detecting a host region within the image) ([0066-0067] and [0075-0077]) using a first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) learned by a learning method and a texture region detected from the input image using a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) learned (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]) by the learning method (a neural network machine learning model) ([0012]), the learning method performing learning of the first detection unit and the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated in an information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting the object region from the input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 
1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing method (apparatus, systems, and methods for integrating source digital content with target digital content) ([0002-0004]) includes: generating the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and generating the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the generating generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). 
However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 21, Marino teaches a non-transitory-computer-readable storage medium storing a computer program to cause a computer (non-transitory computer readable medium having executable instructions to cause a processor to function) ([0005]) to function as: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate a synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate learning data (machine learning system for generating training set) ([0077]), the learning data including a label (the learning data including a location for the second image; such as a bounding box) (Figs. 
3C, 3N, 3O, and 3S; [0077]), the label indicating an object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 22, Marino teaches a non-transitory-computer-readable storage medium storing a computer program to cause a computer (non-transitory computer readable medium having executable instructions to cause a processor to function) ([0005]) to function as a learning unit (machine learning model) ([0066-0067]) of a learning apparatus (apparatus with a neural network machine learning model) ([0012]) configured to perform learning of a detection unit that detects an object region from an input image included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 
1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. 
Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). Regarding claim 23, Marino teaches a non-transitory-computer-readable storage medium storing a computer program to cause a computer (non-transitory computer readable medium having executable instructions to cause a processor to function) ([0005]) to function as a learning unit (machine learning model) ([0066-0067]) of a learning apparatus (apparatus with a neural network machine learning model) ([0012]) configured to perform learning of a first detection unit and a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 
3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting an object region from an input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 
3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). 
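As a further illustration, the texture label recited in claims 19, 20, 23, and 25 (a label indicating a region having the texture in the closed region in the synthesized image) can be pictured as a second, pixel-level mask kept alongside the object label. The sketch below is hypothetical and is not drawn from the application or from the cited references: it assumes the caller already knows which pixels of the second image are textured, and the function name `place_texture_label` and the 0/1 mask encoding are assumptions made for this example.

```python
from typing import List

Mask = List[List[int]]  # assumed: 1 marks a textured pixel, 0 everything else


def place_texture_label(image_h: int, image_w: int,
                        patch_texture: Mask, x: int, y: int) -> Mask:
    """Lift a patch-level texture mask into synthesized-image coordinates."""
    h, w = len(patch_texture), len(patch_texture[0])
    if y < 0 or x < 0 or y + h > image_h or x + w > image_w:
        raise ValueError("closed region must lie inside the synthesized image")
    full = [[0] * image_w for _ in range(image_h)]  # nothing textured outside the region
    for dy in range(h):
        for dx in range(w):
            full[y + dy][x + dx] = 1 if patch_texture[dy][dx] else 0
    return full


# Hypothetical 3-wide, 2-high patch whose right two columns carry the texture.
patch_texture = [[0, 1, 1], [0, 1, 1]]
texture_label = place_texture_label(4, 6, patch_texture, x=2, y=1)
```

Under these assumptions the texture label can be narrower than the closed region itself, which is the distinction the claims draw between the object-region label and the texture label.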
Regarding claim 24, Marino teaches a non-transitory-computer-readable storage medium storing a computer program to cause a computer (non-transitory computer readable medium having executable instructions to cause a processor to function) ([0005]) to function as each unit of an image recognition apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003] and [0018]), the image recognition apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003] and [0018]) comprising a detection unit configured to detect an object region (detecting a host region within the image) ([0066-0067] and [0075-0077]) from an input image using a detection unit learned by a learning apparatus (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) that includes a learning unit (machine learning model) ([0066-0067]), the learning unit (machine learning model) ([0066-0067]) performing learning of the detection unit that detects the object region from the input image included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) and a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 
1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]). However, Marino does not explicitly teach that the “synthesized image” is used within the learning data. Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]). 
Regarding claim 25, Marino teaches a non-transitory-computer-readable storage medium storing a computer program to cause a computer (non-transitory computer readable medium having executable instructions to cause a processor to function) ([0005]) to function as each unit of an image recognition apparatus (image recognition apparatus) ([0018]), the image recognition apparatus (image recognition apparatus) ([0018]) comprising a formation unit configured to form a new object region (forming a host region by forming a bounding box) (Figs. 3N and 3O; [0119]) using an object region detected from an input image (detecting a host region within the image) ([0066-0067] and [0075-0077]) using a first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) learned (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]) by a learning apparatus (apparatus with a neural network machine learning model) ([0012]) and a texture region detected from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) using a second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) learned by the learning apparatus (apparatus with a neural network machine learning model) ([0012]), the learning apparatus (apparatus with a neural network machine learning model) ([0012]) including a learning unit (machine learning model) ([0066-0067]) configured to perform learning of the first detection unit and the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) included in learning data (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]) generated by a second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) of an information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]), a label included in the learning data (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and a texture label included in the learning data (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), the first detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting the object region from the input image (wherein the learning data includes learning a host region within the image) ([0066-0067] and [0075-0077]), the second detection unit (wherein the host region identification module 110 includes a plurality of host region identification sub-modules) (Fig. 1; [0096]) detecting a region having a texture from the input image (wherein the learning data includes learning a host region within the image, such as based on texture) ([0066-0067] and [0075-0077]), wherein the information processing apparatus (apparatus and system for integrating source digital content with target digital content) ([0002-0003]) includes: a first generation unit (wherein content integration system 100 can use a content integration module 120) (Fig. 1; [0116]) configured to generate the synthesized image in which a second image is synthesized in a closed region in a first image (wherein the content integration module 120 can overlay or place source digital content (second image) onto the detected host region (closed region) of the target digital content (first image)) (Figs. 1 and 3A-3U; [0116]); and the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) configured to generate the learning data (machine learning system for generating training set) ([0077]), the learning data including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), the label indicating the object region including a region corresponding to the closed region in the synthesized image (the label indicating a region for the closed region for where the second image will be placed to generate the synthesized image) (Figs. 3C, 3N, 3O, and 3S; [0077], [0117], and [0119-0120]), wherein the second generation unit (host region identification module 110) (Fig. 1; [0076-0077]) generates the learning data (machine learning system for generating training set) ([0077]) including the label (the learning data including a location for the second image; such as a bounding box) (Figs. 3C, 3N, 3O, and 3S; [0077]), and the texture label (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) indicating a region having the texture in the closed region in the synthesized image (wherein the texture label indicates the texture within the closed region (bounding box) for the synthesized image) ([0088], [0095-0096], [0098], [0248], and [0344-0348]).

However, Marino does not explicitly teach that the “synthesized image” is used within the learning data.
Quinton teaches systems and methods for generating composite based data for use in machine learning systems (Abstract); wherein generating composite data comprising the desired label of a response entry and image data corresponding to the fragment of the composite image (Abstract); and wherein composite images can be used as composite training data ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Marino to include the synthesized/composite image as part of the learning data since it improves the efficiency of generating training data and the training process (Quinton; [0011] and [0058]).

Claim(s) 11 is rejected under 35 U.S.C. 103 as being unpatentable over Marino et al., US 2017/0278289 A1 (Marino), Quinton et al., US 2022/0327811 A1 (Quinton), and further in view of Liu et al., US 2021/0279841 A1 (Liu).

Regarding claim 11, Marino teaches comprising an identification unit configured to identify whether an input image is a texture image generated by third generation unit or an actually captured texture image (the host region identification module can include, for example, one or more machine learning-based classifiers, such as a random forest classifier, that are configured to determine whether a texture of a region in a target digital content is sufficiently bland and/or uniform so that the region could be classified as a host region) ([0076]), wherein the acquisition unit acquires the texture image as the second image (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]), and the acquisition unit acquires a texture image generated by a learned generation unit as the second image (acquiring texture of the source digital content so as to be able to recreate the luminance (and thus the texture and luminance changes) of the target digital content in the source digital content) ([0088] and [0344]) such that the texture image generated according to a random number or a random number vector is identified as being the actually captured texture image by the identification unit (wherein the texture of the pixels is done based on predicting a label for each pixel based on a conditional random field) ([0193-0198]). Quinton teaches determining textures from artificial images and/or real images ([0076]).

However, neither explicitly teaches “using a generative adversarial network that generates the texture image”. Liu teaches a discriminator that discriminates between different data instances, such as categorizing an input data item as true or false ([0070]); wherein the discriminator takes as input a synthesized image generated by a generator as well as a random sample of image training data used to generate said synthesized image that is a subset of said synthesized image data ([0070]); and wherein using a generative adversarial network that generates the texture image (wherein a generative adversarial network (GAN) is used to make sure the image conforms to a specific style or texture) ([0071] and [0079]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include using a GAN for the texture since using a GAN can improve generator (generating a specific output or classification) operation (Liu; [0063]) and the training is iteratively improved for the generator and discriminator (Liu; [0074]).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Jogo, US 2001/0048447 A1 teaches: according to an aspect of the present invention, a method of cropping and synthesizing an image on a screen comprises the steps of displaying a crop boundary with a reference point on an image to synthesize on the screen, upon selecting a template having at least a frame, the crop boundary having a corresponding shape to that of the frame of the selected template and being variable in size while keeping the same shape and being centered on the reference point; moving the crop boundary on the screen through an operation device, to position the reference point of the crop boundary on an appropriate point of the image to synthesize; thereafter enlarging or reducing the crop boundary about the reference point, to bound an appropriate area of the image to synthesize; cropping an image of the bounded area; and pasting the cropped image in the frame of the template after enlarging or reducing the cropped image in accordance with the size of the frame of the template ([0008]).

Contact

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL J VANCHY JR whose telephone number is (571)270-1193. The examiner can normally be reached Monday - Friday 9am - 5pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHAEL J VANCHY JR/
Primary Examiner, Art Unit 2666
Michael.Vanchy@uspto.gov

Prosecution Timeline

Jan 20, 2023: Application Filed
Mar 31, 2026: Non-Final Rejection mailed, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12633151: ANOMALOUS EVENT PREDICTION USING CONTRASTIVE LEARNING (4y 2m to grant; granted May 19, 2026)
Patent 12633130: METHOD, PROCESSOR CIRCUIT AND COMPUTER-READABLE STORAGE MEDIUM FOR PEDESTRIAN DETECTION BY A PROCESSOR CIRCUIT OF A MOTOR VEHICLE (2y 11m to grant; granted May 19, 2026)
Patent 12626531: Systems, Methods and Media for Deep Shape Prediction (3y 3m to grant; granted May 12, 2026)
Patent 12614386: METHOD OF PROCESSING VIDEO, METHOD OF QUERING VIDEO, AND METHOD OF TRAINING MODEL (3y 4m to grant; granted Apr 28, 2026)
Patent 12602906: IMAGE RECOGNITION APPARATUS (3y 2m to grant; granted Apr 14, 2026)
Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 67%
With Interview (+20.1%): 87%
Median Time to Grant: 3y 3m (~0m remaining)
PTA Risk: Low
Based on 608 resolved cases by this examiner. Grant probability derived from career allowance rate.
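
As a rough check on how these figures relate, the sketch below recomputes them from the counts reported in the Examiner Intelligence summary (406 granted out of 608 resolved) and the +20.1% interview lift. It assumes the grant probability is simply the rounded career allowance rate and that the interview lift is added in percentage points; the page does not state its exact method, and the function names are hypothetical.

```python
def allowance_rate_pct(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage (assumed: granted / resolved)."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview_pct(base_pct: float, lift_points: float) -> float:
    """Assumed additive lift in percentage points, capped at 100."""
    return min(100.0, base_pct + lift_points)


# Counts and lift as reported on this page.
base = allowance_rate_pct(406, 608)
grant_probability = round(base)
interview_probability = round(with_interview_pct(base, 20.1))
print(grant_probability, interview_probability)
```

Under these assumptions the rounded values match the 67% and 87% shown above. This only confirms that the displayed numbers are mutually consistent; it is not a prediction for this application.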
