DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on March 16, 2026 has been entered. Accordingly, Claims 1-20 are pending in this application. Claims 1, 11, 13-15, and 17 have been amended. Claims 1, 11, and 17 are independent claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 6, 7, 11, 13, 16, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy (PG Pub. No. 2020/0334270 A1) and further in view of Makumbi (PG Pub. No. 2022/0222266 A1) and Gottlieb (PG Pub. No. 2017/0236157 A1).
Regarding Claim 1, Vasireddy discloses a computer-implemented method comprising:
receiving a scan request from a client device to scan data (see Vasireddy, paragraph [0147], where Fig. 12 shows a system for running warehouse loads for multiple tenants of a data warehouse, in accordance with an embodiment);
assigning, in response to the scan request, one or more partitions of an intermediate shared processing queue to scan website data, wherein a first partition of the one or more partitions is assigned a first subset of data and a second partition of the one or more partitions is assigned a second subset of data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out);
scanning, by the first partition, the first subset of data to extract first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out); and
scanning, by the second partition, the second subset of data to extract second website data in parallel with publishing the first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out … depending on the concurrent load that the platform is sized for, the ETL server can have many agents running [it is the position of the Examiner that concurrent loads constitute one ETL job for one client operating in the load phase while a second ETL job for a second client is operating in the extract phase]).
Vasireddy does not disclose:
sending, to the client device, status information indicating a first progress of scanning of the first subset of websites;
publishing, based on the status information indicating that the first progress of scanning is completed, the first website data to a tenant database associated with the client device; and
the first and second data are website data associated with a set of websites.
Vasireddy in view of Makumbi discloses:
sending, to the client device, status information indicating a first progress of scanning of the first subset of websites (see Makumbi, paragraph [0003], where a method for monitoring extract, transform, load (ETL) jobs includes obtaining, by a monitoring device, information related to one or more ETL jobs scheduled in an ETL system; generating, by the monitoring device, ETL job metrics that include status information, timing information, and data volume information associated with one or more constituent tasks associated with one or more ETL jobs scheduled in the ETL system, wherein the ETL job metrics include metrics related to one or more of extracting data records from a data source; see also paragraph [0004], where status information includes … execution time, a start time, or completion time for an ETL job or a constituent task associated with an ETL job); and
publishing, based on the status information indicating that the first progress of scanning is completed (see Vasireddy, paragraph [0081], where when the extract process has completed its extraction, the data transformation layer can be used to begin the transform process, to transform the extracted data into a model format to be loaded into the customer schema of the data warehouse), the first website data to a tenant database associated with the client device (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out).
Vasireddy and Makumbi are both directed toward ETL jobs. Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Makumbi as it amounts to combining prior art elements according to known techniques to yield predictable results (see MPEP 2143(I)(A)).
Vasireddy does not disclose the first data and second data are website data associated with a set of websites. Gottlieb discloses the first and second data are website data associated with a set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Makumbi with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 2, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein assigning the one or more partitions further comprises:
Vasireddy does not disclose determining the one or more partitions to assign based on a number of websites associated with the scan request. Vasireddy in view of Gottlieb discloses determining the one or more partitions to assign based on a number (see Vasireddy, paragraphs [0116]-[0118], where compute instances/services (virtual machines) which execute the ETL process for various customers … based on a determination of historical performance data recorded over a period of time, the system can optimize the execution of activation plans, e.g., for one or more functional areas associated with a particular tenant, or across a sequence of activation plans associated with multiple tenants, to address utilization of the VMs and service level agreements (SLAs) for those tenants; such historical data can include statistics of load volumes and load times; for example, the historical data can include size of extraction, count of extraction, extraction time, size of warehouse … warehouse table count) of websites associated with the scan request (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 3, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, further comprising assigning an additional one or more partitions in response to receiving an additional scan request from the client device (see Vasireddy, paragraph [0118], where historical data can be used to estimate and plan current and future activation plans in order to organize various tasks to, for example, run in sequence or in parallel to arrive at a minimum time to run an activation plan; in addition, the gathered historical data can be used to optimize across multiple activation plans for a tenant).
Regarding Claim 6, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein scanning the first subset of websites to extract the first website data further comprises at least one of:
replicating the first data;
analyzing the first data; or
categorizing the first data (see Vasireddy, paragraphs [0085], [0086], where the data transformation layer can transform extracted data into a format suitable for loading into a customer schema of data warehouse … the data transformation can perform dimension generation … for example, dimensions can include categories of data such as, for example ‘name’, ‘address’, or ‘age’).
Vasireddy does not disclose the data is website data associated with a set of websites. Gottlieb discloses the data is website data associated with a set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 7, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein scanning the first subset of websites to extract the first website data further comprises:
Vasireddy does not disclose extracting, from the first subset of websites, at least one of: cookies, tags, forms, or storages. Gottlieb discloses extracting, from the first subset of websites, at least one of: cookies (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set), tags, forms, or storages.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 11, Vasireddy discloses a system comprising:
one or more non-transitory computer readable media (see Vasireddy, paragraph [0184], where teachings herein can include a computer program product which is a non-transitory computer readable storage medium); and
processing hardware (see Vasireddy, paragraph [0183], where teachings herein may be conveniently implemented using one or more conventional general purpose or specialized computer, computing device, machine, or microprocessor) configured to cause the system to:
receive a scan request from a client device to scan data (see Vasireddy, paragraph [0147], where Fig. 12 shows a system for running warehouse loads for multiple tenants of a data warehouse, in accordance with an embodiment);
assign, in response to the scan request, one or more partitions of an intermediate shared processing queue to scan website data, wherein a first partition of the one or more partitions is assigned a first subset of data and a second partition of the one or more partitions is assigned a second subset of data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out);
scan, by the first partition, the first subset of data to extract first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out); and
scan, by the second partition, the second subset of data to extract second website data in parallel with publishing the first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out … depending on the concurrent load that the platform is sized for, the ETL server can have many agents running [it is the position of the Examiner that concurrent loads constitute one ETL job for one client operating in the load phase while a second ETL job for a second client is operating in the extract phase]).
Vasireddy does not disclose:
send, to the client device, status information indicating a first progress of scanning of the first subset of websites;
publish, based on the status information indicating that the first progress of scanning is completed, the first website data to a tenant database associated with the client device; and
the first and second data are website data associated with a set of websites.
Vasireddy in view of Makumbi discloses:
send, to the client device, status information indicating a first progress of scanning of the first subset of websites (see Makumbi, paragraph [0003], where a method for monitoring extract, transform, load (ETL) jobs includes obtaining, by a monitoring device, information related to one or more ETL jobs scheduled in an ETL system; generating, by the monitoring device, ETL job metrics that include status information, timing information, and data volume information associated with one or more constituent tasks associated with one or more ETL jobs scheduled in the ETL system, wherein the ETL job metrics include metrics related to one or more of extracting data records from a data source; see also paragraph [0004], where status information includes … execution time, a start time, or completion time for an ETL job or a constituent task associated with an ETL job); and
publish, based on the status information indicating that the first progress of scanning is completed (see Vasireddy, paragraph [0081], where when the extract process has completed its extraction, the data transformation layer can be used to begin the transform process, to transform the extracted data into a model format to be loaded into the customer schema of the data warehouse), the first website data to a tenant database associated with the client device (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out).
Vasireddy and Makumbi are both directed toward ETL jobs. Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Makumbi as it amounts to combining prior art elements according to known techniques to yield predictable results (see MPEP 2143(I)(A)).
Vasireddy does not disclose the first data and second data are website data associated with a set of websites. Gottlieb discloses the first and second data are website data associated with a set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Makumbi with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 13, Vasireddy in view of Makumbi and Gottlieb discloses the system of Claim 11, wherein the processing hardware is configured to cause the system to:
scan the first subset of data to extract the first data by categorizing the first data (see Vasireddy, paragraphs [0085], [0086], where the data transformation layer can transform extracted data into a format suitable for loading into a customer schema of data warehouse … the data transformation can perform dimension generation … for example, dimensions can include categories of data such as, for example ‘name’, ‘address’, or ‘age’).
Vasireddy does not disclose the data is website data associated with a set of websites. Gottlieb discloses the data is website data associated with a set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 16, Vasireddy in view of Makumbi and Gottlieb discloses the system of Claim 11, wherein the processing hardware is configured to cause the system to:
Vasireddy does not disclose:
receive the scan request originating from the client device in a first region from one or more servers of a second region; and
replicate, utilizing an intermediate replicator, the first website data within the first region to the second region to be compatible with the first region.
Gottlieb discloses:
receive the scan request originating from the client device in a first region from one or more servers of a second region (see Gottlieb, paragraphs [0141], [0142], where system 10 may be used to generate and maintain user profile cookie sets for user profiles that are specific to selected geographical locations; system 10 may generate and maintain a user profile cookie set for a user profile that is specific to a particular geographical location by loading websites (as described above in connection with step 130 of FIG. 10 or step 160 of FIG. 12) using computing equipment (e.g., proxy servers or other computing equipment) that is located in that geographical location. However, this is merely illustrative; if desired, system 10 may generate and maintain a user profile cookie set for a user profile that is specific to a selected geographical location by selecting sets of websites to browse, at least in part, by the percentage of users of that website that are located in that particular geographical region; in this way, cookies sets may be generated that indicate to a publisher website that a simulated user lives in, primarily browses the internet from, or primarily browses websites associated with a particular geographical region); and
replicate, utilizing an intermediate replicator, the first website data within the first region to the second region to be compatible with the first region (see Gottlieb, paragraphs [0141], [0142], where system 10 may be used to generate and maintain user profile cookie sets for user profiles that are specific to selected geographical locations; system 10 may generate and maintain a user profile cookie set for a user profile that is specific to a particular geographical location by loading websites (as described above in connection with step 130 of FIG. 10 or step 160 of FIG. 12) using computing equipment (e.g., proxy servers or other computing equipment) that is located in that geographical location. However, this is merely illustrative; if desired, system 10 may generate and maintain a user profile cookie set for a user profile that is specific to a selected geographical location by selecting sets of websites to browse, at least in part, by the percentage of users of that website that are located in that particular geographical region; in this way, cookies sets may be generated that indicate to a publisher website that a simulated user lives in, primarily browses the internet from, or primarily browses websites associated with a particular geographical region).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler (see Gottlieb, paragraph [0006]).
Regarding Claim 17, Vasireddy discloses a non-transitory computer-readable medium storing executable instructions which, when executed by a processing device, cause the processing device to perform operations comprising:
receiving a scan request from a client device to scan data (see Vasireddy, paragraph [0147], where Fig. 12 shows a system for running warehouse loads for multiple tenants of a data warehouse, in accordance with an embodiment);
assigning, in response to the scan request, one or more partitions of an intermediate shared processing queue to scan website data, wherein a first partition of the one or more partitions is assigned a first subset of data and a second partition of the one or more partitions is assigned a second subset of data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out);
scanning, by the first partition, the first subset of data to extract first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out); and
scanning, by the second partition, the second subset of data to extract second website data in parallel with publishing the first data (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out … depending on the concurrent load that the platform is sized for, the ETL server can have many agents running [it is the position of the Examiner that concurrent loads constitute one ETL job for one client operating in the load phase while a second ETL job for a second client is operating in the extract phase]).
Vasireddy does not disclose:
sending, to the client device, status information indicating a first progress of scanning of the first subset of websites;
publishing, based on the status information indicating that the first progress of scanning is completed, the first website data to a tenant database associated with the client device; and
the first and second data are website data associated with a set of websites.
Vasireddy in view of Makumbi discloses:
sending, to the client device, status information indicating a first progress of scanning of the first subset of websites (see Makumbi, paragraph [0003], where a method for monitoring extract, transform, load (ETL) jobs includes obtaining, by a monitoring device, information related to one or more ETL jobs scheduled in an ETL system; generating, by the monitoring device, ETL job metrics that include status information, timing information, and data volume information associated with one or more constituent tasks associated with one or more ETL jobs scheduled in the ETL system, wherein the ETL job metrics include metrics related to one or more of extracting data records from a data source; see also paragraph [0004], where status information includes … execution time, a start time, or completion time for an ETL job or a constituent task associated with an ETL job); and
publishing, based on the status information indicating that the first progress of scanning is completed (see Vasireddy, paragraph [0081], where when the extract process has completed its extraction, the data transformation layer can be used to begin the transform process, to transform the extracted data into a model format to be loaded into the customer schema of the data warehouse), the first website data to a tenant database associated with the client device (see Vasireddy, paragraph [0107], where as illustrated in Fig. 4, which illustrates the operation of the system with a plurality of tenants (customers) in accordance with an embodiment, data can be sourced, e.g., from each of a plurality of customer’s (tenant’s) enterprise software application or data environment, using the data pipeline process as described above, and loaded into a data warehouse instance; see also paragraph [0170], where by having a cluster of data integrator agents, when a system submits multiple ETL runs for multiple tenants in parallel, the data integrator can service requests appropriately by increasing the number of agents to a repository when the system needs to scale out).
Vasireddy and Makumbi are both directed toward ETL jobs. Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Makumbi as it amounts to combining prior art elements according to known techniques to yield predictable results (see MPEP 2143(I)(A)).
Vasireddy does not disclose the first data and second data are website data associated with a set of websites. Gottlieb discloses the first and second data are website data associated with a set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Makumbi with Gottlieb for the benefit of simulating a human user with a web crawler to obtain website data such as cookies (see Gottlieb, Abstract).
Regarding Claim 19, Vasireddy in view of Makumbi and Gottlieb discloses the non-transitory computer-readable medium of Claim 17, further comprising assigning an additional one or more partitions in response to receiving an additional scan request from the client device (see Vasireddy, paragraph [0118], where historical data can be used to estimate and plan current and future activation plans in order to organize various tasks to, for example, run in sequence or in parallel to arrive at a minimum time to run an activation plan; in addition, the gathered historical data can be used to optimize across multiple activation plans for a tenant).
Claims 4, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy, Makumbi, and Gottlieb as applied to Claims 1-3, 6, 7, 11, 13, 16, 17, and 19 above, and further in view of Yin (US Patent No. 11,281,625 B1) and Banerjee (PG Pub. No. 2016/0246845 A1).
Regarding Claim 4, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein assigning the one or more partitions comprises:
Vasireddy does not disclose:
determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request; and
assigning the one or more partitions to the set of websites based on the classification of the scan request.
Vasireddy in view of Gottlieb and Yin discloses assigning the one or more partitions to the set of websites (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set) based on the classification of the scan request (see Yin, Claim 15, wherein the instructions that cause the computer system to provision the computing resources sufficient to process the query further include instructions that cause the computer system to provision the computing resources sufficient to process the query based at least in part on a set of query types wherein each query type of the set of query types has a corresponding set of resources sufficient to process a query of the query type).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler (see Gottlieb, paragraph [0006]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Gottlieb with Yin for the benefit of provisioning sufficient resources for a query based on query type (see Yin, Claim 15).
Vasireddy in view of Gottlieb and Yin does not disclose determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request. Banerjee discloses determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request (see Banerjee, paragraph [0021], where the query may be one of a registered query and an ad-hoc query; it may be understood that, the registered query is the query that is registered in the system and executed on data upon expiration of each pre-defined time interval; the ad-hoc query, on the other hand, may be received from the user in real-time via a user interface).
Banerjee and Yin are both directed to provisioning resources for a query based on query type. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy, Gottlieb, and Yin with Banerjee as Banerjee uses a known technique to improve Yin in the same way (see MPEP 2141(III)(C)).
Regarding Claim 12, Vasireddy in view of Makumbi and Gottlieb discloses the system of Claim 11, wherein the processing hardware is configured to cause the system to assign the one or more partitions by:
Vasireddy does not disclose:
identifying a number of websites associated with the scan request;
determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request; and
determining the one or more partitions from a plurality of partitions of the intermediate shared processing queue to assign based on the number of websites and the classification of the scan request.
Vasireddy in view of Gottlieb discloses:
identifying a number of websites associated with the scan request (see Gottlieb, Claim 34, wherein the method includes selecting, from the plurality of publisher websites, a set of publisher websites to browse).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler (see Gottlieb, paragraph [0006]).
Vasireddy in view of Gottlieb does not disclose:
determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request; and
assigning the one or more partitions based on the number of websites and the classification of the scan request.
Yin discloses assigning the one or more partitions based on the number of websites and the classification of the scan request (see Yin, Claim 15, wherein the instructions that cause the computer system to provision the computing resources sufficient to process the query further include instructions that cause the computer system to provision the computing resources sufficient to process the query based at least in part on a set of query types wherein each query type of the set of query types has a corresponding set of resources sufficient to process a query of the query type).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Gottlieb with Yin for the benefit of provisioning sufficient resources for a query based on query type (see Yin, Claim 15).
Vasireddy in view of Gottlieb and Yin does not disclose determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request. Banerjee discloses determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request (see Banerjee, paragraph [0021], where the query may be one of a registered query and an ad-hoc query; it may be understood that, the registered query is the query that is registered in the system and executed on data upon expiration of each pre-defined time interval; the ad-hoc query, on the other hand, may be received from the user in real-time via a user interface).
Banerjee and Yin are both directed to provisioning resources for a query based on query type. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy, Gottlieb, and Yin with Banerjee as Banerjee uses a known technique to improve Yin in the same way (see MPEP 2141(III)(C)).
Regarding Claim 18, Vasireddy in view of Makumbi and Gottlieb discloses the non-transitory computer-readable medium of Claim 17, wherein assigning the one or more partitions further comprises:
Vasireddy does not disclose:
identifying a number of websites associated with the scan request;
determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request; and
assigning the one or more partitions to the set of websites based on the classification of the scan request and the number of websites.
Vasireddy in view of Gottlieb discloses:
identifying a number of websites associated with the scan request (see Gottlieb, Claim 34, wherein the method includes selecting, from the plurality of publisher websites, a set of publisher websites to browse).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler (see Gottlieb, paragraph [0006]).
Vasireddy in view of Gottlieb does not disclose:
determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request; and
assigning the one or more partitions to the set of websites based on the classification of the scan request and the number of websites.
Yin discloses:
assigning the one or more partitions to the set of websites based on the classification of the scan request (see Yin, Claim 15, wherein the instructions that cause the computer system to provision the computing resources sufficient to process the query further include instructions that cause the computer system to provision the computing resources sufficient to process the query based at least in part on a set of query types wherein each query type of the set of query types has a corresponding set of resources sufficient to process a query of the query type) and the number of websites (see Vasireddy, paragraphs [0116]-[0118], where compute instances/services (virtual machines) which execute the ETL process for various customers … based on a determination of historical performance data recorded over a period of time, the system can optimize the execution of activation plans, e.g., for one or more functional areas associated with a particular tenant, or across a sequence of activation plans associated with multiple tenants, to address utilization of the VMs and service level agreements (SLAs) for those tenants; such historical data can include statistics of load volumes and load times; for example, the historical data can include size of extraction, count of extraction, extraction time, size of warehouse … warehouse table count).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Gottlieb with Yin for the benefit of provisioning sufficient resources for a query based on query type (see Yin, Claim 15).
Vasireddy in view of Gottlieb and Yin does not disclose determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request. Banerjee discloses determining, for the scan request, a classification comprising one of a user-initiated scan request, a scheduled request, or a priority request (see Banerjee, paragraph [0021], where the query may be one of a registered query and an ad-hoc query; it may be understood that, the registered query is the query that is registered in the system and executed on data upon expiration of each pre-defined time interval; the ad-hoc query, on the other hand, may be received from the user in real-time via a user interface).
Banerjee and Yin are both directed to provisioning resources for a query based on query type. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy, Gottlieb, and Yin with Banerjee as Banerjee uses a known technique to improve Yin in the same way (see MPEP 2141(III)(C)).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy, Makumbi, and Gottlieb as applied to Claims 1-3, 6, 7, 11, 13, 16, 17, and 19 above, and further in view of Stanfill (PG Pub. No. 2011/0153662 A1).
Regarding Claim 5, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, further comprising:
Vasireddy does not disclose:
receiving a first additional scan request that comprises a user-initiated scan request;
receiving a second additional scan request that comprises a priority request;
extracting data of the second additional scan request prior to extracting data of the first additional scan request.
Stanfill discloses:
receiving a first additional scan request that comprises a user-initiated scan request (see Stanfill, Fig. 7C, where high priority query A completes before lower priority queries B and C);
receiving a second additional scan request that comprises a priority request (see Stanfill, Fig. 7C, where high priority query A completes before lower priority queries B and C);
extracting data of the second additional scan request prior to extracting data of the first additional scan request (see Stanfill, paragraph [0045], where referring to FIG. 7C, in some circumstances a query A may be provided a priority high enough that processing of other queries B and C is suspended until the query A completes execution).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Stanfill for the benefit of query execution management by query priority (see Stanfill, Abstract, paragraph [0045]).
Claims 8, 9, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy, Makumbi, and Gottlieb as applied to Claims 1-3, 6, 7, 11, 13, 16, 17, and 19 above, and further in view of Yastrebenetsky (PG Pub. No. 2021/0382949 A1).
Regarding Claim 8, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein publishing the first website data to the tenant database further comprises:
Vasireddy does not disclose:
generating one or more messages based on scanning the first website data from the first subset of websites corresponding to one of cookies, tags, forms, or storages; and
providing, to the client device, the one or more messages.
Yastrebenetsky discloses:
generating one or more messages based on scanning the first website data from the first subset of websites corresponding to one of cookies, tags, forms, or storages (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date); and
providing, to the client device, the one or more messages (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date; see also paragraph [0062], where the system may generate (420) and display (422) a report that describes or illustrates the various detected state changes, as well as their impact, source, and other characteristics [it is the position of the Examiner that the user that initiates the system is presented the report in step 422]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Yastrebenetsky for the benefit of monitoring activity of web content tags and cookies during interaction with web content (see Yastrebenetsky, Abstract).
Regarding Claim 9, Vasireddy in view of Makumbi, Gottlieb and Yastrebenetsky discloses the computer-implemented method of Claim 8, wherein generating the one or more messages further comprises:
Vasireddy does not disclose generating a message of the one or more messages that corresponds to a batch of a predetermined number of entities of extracted data. Yastrebenetsky discloses generating a message of the one or more messages that corresponds to a batch of a predetermined number of entities of extracted data (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Yastrebenetsky for the benefit of monitoring activity of web content tags and cookies during interaction with web content (see Yastrebenetsky, Abstract).
Regarding Claim 14, Vasireddy in view of Makumbi, Gottlieb and Yastrebenetsky discloses the system of Claim 11, wherein the processing hardware is configured to cause the system to:
Vasireddy does not disclose:
generate the message corresponding to a batch of extracted data that includes one or more of cookies, tags, forms, or storages, wherein the message comprises the first website data; and
providing, to the client device, the message.
Yastrebenetsky discloses:
generate the message corresponding to a batch of extracted data that includes one or more of cookies, tags, forms, or storages, wherein the message comprises the first website data (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date); and
providing, to the client device, the message (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date; see also paragraph [0062], where the system may generate (420) and display (422) a report that describes or illustrates the various detected state changes, as well as their impact, source, and other characteristics [it is the position of the Examiner that the user that initiates the system is presented the report in step 422]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Yastrebenetsky for the benefit of monitoring activity of web content tags and cookies during interaction with web content (see Yastrebenetsky, Abstract).
Regarding Claim 20, Vasireddy in view of Makumbi and Gottlieb discloses the non-transitory computer-readable medium of Claim 17, wherein publishing the first website data to the tenant database further comprises:
Vasireddy does not disclose generating, for display at the client device, one or more messages based on scanning the first website data from the first subset of websites corresponding to one of cookies, tags, forms, or storages.
Yastrebenetsky discloses generating, for display at the client device, one or more messages based on scanning the first website data from the first subset of websites corresponding to one of cookies, tags, forms, or storages (see Yastrebenetsky, Claim 5, wherein the web content inspection report includes, for each of the plurality of browser storage events, a set of cookie data that describes a cookie name, a cookie value, a cookie set method, a cookie setting tag name, a cookie setter URL, a cookie expiration date; see also paragraph [0062], where the system may generate (420) and display (422) a report that describes or illustrates the various detected state changes, as well as their impact, source, and other characteristics [it is the position of the Examiner that the user that initiates the system is presented the report in step 422]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Yastrebenetsky for the benefit of monitoring activity of web content tags and cookies during interaction with web content (see Yastrebenetsky, Abstract).
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy, Makumbi, and Gottlieb as applied to Claims 1-3, 6, 7, 11, 13, 16, 17, and 19 above, and further in view of Doutre (US Patent No. 6,470,345 B1).
Regarding Claim 10, Vasireddy in view of Makumbi and Gottlieb discloses the computer-implemented method of Claim 1, wherein publishing the first website data to the tenant database further comprises:
Vasireddy does not disclose:
determining one or more entities of the first website data exceeds a predetermined number of characters;
generating a hash value for the one or more entities that exceeds the predetermined number of characters; and
publishing to the tenant database, the hash value for the one or more entities that exceeds the predetermined number of characters.
Vasireddy in view of Gottlieb and Doutre discloses:
determining one or more entities of the first website (see Gottlieb, Claim 1, where the method comprises using a web crawler of the cookie harvesting computing equipment to load a publisher website while allowing the publisher website to update the obtained cookie set) data exceeds a predetermined number of characters (see Doutre, column 2, lines 24-32, where the number of sometimes quite lengthy strings poses a significant problem, especially when these are broken into substrings which then are constantly compared to other substrings; by parsing the strings into their semantically correct substrings and replacing those substrings with unique numeric tokens, a significant improvement is realized in the storage of the strings as well as better performance in comparing those substrings);
generating a hash value for the one or more entities that exceeds the predetermined number of characters (see Doutre, column 2, lines 24-32, where the number of sometimes quite lengthy strings poses a significant problem, especially when these are broken into substrings which then are constantly compared to other substrings; by parsing the strings into their semantically correct substrings and replacing those substrings with unique numeric tokens, a significant improvement is realized in the storage of the strings as well as better performance in comparing those substrings); and
publishing to the tenant database the hash value for the one or more entities that exceeds the predetermined number of characters (see Doutre, column 2, lines 24-32, where the number of sometimes quite lengthy strings poses a significant problem, especially when these are broken into substrings which then are constantly compared to other substrings; by parsing the strings into their semantically correct substrings and replacing those substrings with unique numeric tokens, a significant improvement is realized in the storage of the strings as well as better performance in comparing those substrings).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Gottlieb for the benefit of simulating a human user with a web crawler (see Gottlieb, paragraph [0006]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy and Gottlieb with Doutre for the benefit of maintaining semantic and canonical validity of a file path and name (see Doutre, column 1, lines 28-40).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Vasireddy, Makumbi, and Gottlieb as applied to Claims 1-3, 6, 7, 11, 13, and 16-20 above, and further in view of Zukowski (PG Pub. No. 2022/0156250 A1).
Regarding Claim 15, Vasireddy in view of Makumbi and Gottlieb discloses the system of Claim 11, wherein the processing hardware is configured to cause the system to:
Vasireddy does not disclose:
determine a predetermined number of scan requests for the client device and prevent at least one scan request of the predetermined number of scan requests from being assigned the one or more partitions in response to determining that an additional scan request from the client device meets or exceeds the predetermined number of scan requests.
Zukowski discloses:
determine a predetermined number of scan requests for the client device and prevent at least one scan request of the predetermined number of scan requests from being assigned the one or more partitions in response to determining that an additional scan request from the client device meets or exceeds the predetermined number of scan requests (see Zukowski, paragraph [0066], where at operation 520, the compute service manager 108 determines whether the query is permitted based on the one or more restrictions … the compute service manager 108 can perform any one or more of the following in determining whether the query is permitted … comparing a query rate of the second user (e.g., a number of queries per day, hour or other unit of time) with a query rate limit).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Vasireddy with Zukowski for the benefit of controlling information access (see Zukowski, Abstract).
Response to Arguments
Applicant’s arguments, filed March 16, 2026, have been fully considered, but they are not persuasive in view of the new grounds of rejection.
Conclusion
The prior art made of record and not relied upon is considered pertinent to the Applicant’s disclosure:
Avalani (PG Pub. No. 2020/0050694 A1), which concerns burst performance of database queries according to query size.
Levine (PG Pub. No. 2018/0322168 A1), which concerns techniques for asynchronous querying.
Ramanathan (PG Pub. No. 2020/0334271 A1), which concerns determining an amount of virtual machines for use with extract transform load processes.
Kalathuru (PG Pub. No. 2021/0097080 A1), which concerns a managed query service.
Koreddi (“Applied Concurrency Techniques for ETL Pipelines: Python Concurrency approaches with a case scenario”, towards data science, November 29, 2021, https://towardsdatascience.com/applied-concurrency-techniques-for-etl-pipelines-32387eb82fbc/), which contains an image explicitly disclosing an ETL pipeline containing multiple processing units executing the extract phase of one job at the same time as processing the load phase for a second job.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARHAD AGHARAHIMI whose telephone number is (571)272-9864. The examiner can normally be reached M-F 9am - 5pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FARHAD AGHARAHIMI/Examiner, Art Unit 2161
/APU M MOFIZ/Supervisory Patent Examiner, Art Unit 2161