Last updated: May 29, 2026
Application No. 18/464,095
Source Code Similarity

Non-Final OA §103
Filed: Sep 08, 2023
Examiner: NGUYEN, DUY KHUONG THANH
Art Unit: 2199
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Crowdstrike Inc.
OA Round: 3 (Non-Final)
Interview Optional

+34.6% interview lift. An interview has already been conducted in this application's prosecution history. This examiner has an 82% grant rate with a +34.6% interview lift. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim-evolution patterns, is recommended.
Based on 546 resolved cases, 2023–2026
Examiner Intelligence

NGUYEN, DUY KHUONG THANH
Career Allowance Rate: 82% (above average), 447 granted / 546 resolved, +26.9% vs TC avg
Interview Lift: +34.6% (strong), resolved cases with interview
Avg Prosecution: 2y 8m (typical timeline), 26 currently pending
Total Applications: 582 across all art units (career history)
Statute-Specific Performance

§101: 0.9% (-39.1% vs TC avg)
§103: 89.5% (+49.5% vs TC avg)
§102: 5.4% (-34.6% vs TC avg)
§112: 1.5% (-38.5% vs TC avg)
Based on career data from 546 resolved cases
Office Action

§103
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
2.	This office action has been issued in response to the amendment filed on 12/06/2025. Claims 1-5, 11 and 15-18 have been amended. Claims 1-20 are pending, of which claims 1, 11 and 18 are in independent form.
Response to Argument
3.	Applicant’s arguments with respect to claims 1-20 have been considered but are moot in view of the new ground(s) of rejection.
Status of Claims
4.	Claims 1-20 are pending, of which claims 1, 11 and 18 are in independent form.
Information Disclosure Statement
5.	The information disclosure statement filed on 12/06/2025 has been reviewed and considered by the Examiner.
The Office's Note:
6.	The Office has cited particular paragraphs / columns and line numbers in the reference(s) applied to the claims above for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim(s), other passages and figures may apply as well. The Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the cited passages as taught by the prior art or relied upon by the Examiner.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

7.	Claims 1-5, 8 and 10-20 are rejected under 35 U.S.C. 103 as being obvious over Saas (US 20160202972, IDS of record, hereinafter Saas), in view of Langton et al. (US 10091222, hereinafter Langton), in view of Hill et al. (US 20160283214, hereinafter Hill), and further in view of Goodsitt et al. (US 12,164,867, hereinafter Goodsitt).


Claim 1 is rejected. Saas teaches a method, comprising:
receiving, by a server, agent embeddings generated by a cyber security agent, the agent embeddings representing a source code file (Saas, US 20160202972, fig. 1, paragraph [0045-0053], On step 108, the characteristic and an identifier of the entity, such as the file library or the file name, may be stored in the repository.  Fig. 1 and paragraph [0061], On step 128, the user code characteristic is transmitted from the user's network to a location in which it may be compared to the repository, such as the cloud or a computing platform having access to the cloud, to a server being in communication with the repository, or the like. In some embodiments, the file or entity name, library or another identifier may also be transmitted. In some embodiments, license information extracted from the entity may also be transmitted.  Paragraph [0062], On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted.  Para [0019], One technical problem dealt with by the disclosed subject matter is the need to detect whether a user's code comprises open source code. Open source may be provided as one or more binary libraries, compiled files, one or more file hierarchies of source code, or the like. Open source may be found in user's code as one or more binary files, one or more source files, or one or more code snippets within source files.); 
Saas does not explicitly teach
in response to the embedding dissimilarity, denying, by the server, an exfiltration of the source code file by blocking the source code file detected by the host operating system
However, Langton teaches
in response to the embedding dissimilarity, denying, by the server, an exfiltration of the source code file by blocking the source code file detected by the host operating system (Langton, US 10091222, column 6, line 25 to 54,  As shown in FIG. 4, process 400 may include receiving a file to be tested for data exfiltration (block 410). For example, security device 220 may receive a file (e.g., an executable file, an application, a program, etc.) to be tested for data exfiltration. In some implementations, the file may be associated with client device 210 (e.g., may be stored by client device 210, may be executing on client device 210, may be requested by client device 210, etc.). As an example, client device 210 may request a file (e.g., from a website, via an email link, etc.), and security device 220 may receive and/or test the file before the file is provided to client device 210. In some implementations, security device 220 may test the file in a testing environment, such as a sandbox environment.  Column 8, line 25 to 36,  As further shown in FIG. 4, process 400 may include determining whether the exfiltration information is detected in the outbound network traffic (block 450). For example, security device 220 may monitor outbound network traffic to detect whether the outbound network traffic includes the exfiltration information (e.g., a resource identifier, information designed to appear to be sensitive information, etc.). In some implementations, security device 220 may monitor outbound network traffic for plaintext that matches text of the exfiltration information (e.g., text corresponding to a resource identifier, text corresponding to sensitive information, etc.  Column 8, line 37 to 56,  As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). 
For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.   Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.    Column 8, line 57 to column 9, line 5,  If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210. In this way, security device 220 may prevent a malicious file from exfiltrating data.).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Langton into Saas’s invention to determine whether a file is a data exfiltration malware application when and/or after data exfiltration occurs, thus increasing the likelihood that data exfiltration is detected and improving the security of stored information. The communication interface permits a device to receive information from another device and/or provide information to another device. The security device uses the malware indicator to identify the file as malware and performs an action to counteract the malware. The likelihood of detecting data exfiltration increases, thus providing better information security, as suggested by Langton (see abstract and summary).
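The plaintext-matching check described in the cited Langton passages (blocks 450 and 460) can be pictured with a minimal sketch. It is an illustration only, not Langton's implementation: the function names, the record fields, and the marker strings are hypothetical, and real traffic inspection would involve far more than a substring search.

```python
def find_exfiltration_markers(outbound_payloads, exfiltration_markers):
    """Return the planted markers that appear as plaintext in outbound traffic.

    Both parameters are hypothetical: payloads are raw bytes captured from
    outbound traffic, markers are strings planted to look like sensitive data.
    """
    hits = set()
    for payload in outbound_payloads:
        text = payload.decode("utf-8", errors="replace")
        for marker in exfiltration_markers:
            if marker in text:
                hits.add(marker)
    return sorted(hits)


def counteract(file_record, hits):
    """Store a malware indicator with the file and decide whether to release it."""
    updated = dict(file_record)  # leave the caller's record untouched
    if hits:
        updated["malware_indicator"] = "suspicious"
        updated["released_to_client"] = False
    else:
        updated["malware_indicator"] = "unsuspicious"
        updated["released_to_client"] = True
    return updated
```

In this sketch a file is released to the client only when none of the planted markers is seen leaving the test environment, which mirrors the "provide the file to client device 210" branch quoted above.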

Saas and Langton do not explicitly teach
receiving, by a server, agent embeddings generated by a cyber security agent monitoring a host operating system, the agent embeddings representing a source code file detected by the host operating system
However, Hill teaches
receiving, by a server, agent embeddings generated by a cyber security agent monitoring a host operating system, the agent embeddings representing a source code file detected by the host operating system(Hill, US 20160283214, para [0024-0025], agent 50 resides on computer 10. For example, the source code on computer 10 is built. Agent 50 is configured to monitor certain designated files of the built code. The built code is packaged and deployed to server 40, but agent 50 continues to reside on computer 10. Agent 50 monitors the designated files on computer 10 for modifications. If modifications are detected for designated files, agent 50 copies the modified files over network 30 from its source file path on computer 10 to the destination file path on server 40 in near-real time.  Para [0017], an agent monitors the source directory tree for changes to the code and synchronizes those changes in near-real time to the corresponding location in the destination directory tree on the end testing system. The agent is configured with a list of source directories to monitor and a corresponding location on the end testing system for each directory. When the agent detects a change in one of the source directories it is configured to monitor, the agent synchronizes the changed code to destination directories on the end system. As such, the package and deploy process is avoided during the iterative implementation and testing cycle.  Para [0023] and [0025],  Agent 50 must have access to the built code in source file system 22 on computer 10. As an example, agent 50 can reside on a development server running on the same computer 10 as the built source code. As another example, agent 50 can reside on a virtual machine on developer 2's computer 10 and access the built code from a shared directory. As a further example, agent 50 can reside on a remote computer, for example server 40, that has mounted the built code using a remote file system, for example destination file system 62. 
Agent 50 need not have access to the original source code on source file system 22 as long as agent 50 has access to the built code. For example, Java source files may be converted into Jar files or CoffeeScript files may be converted into JavaScript files. Agent 50 may be configured to monitor any of these files as long as agent 50 has access to the files on source file system 22.).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Hill into Saas and Langton’s invention to monitor multiple files in a source file system at predetermined time intervals. Modified files are detected in the source file system, and it is determined that a modified file is a designated file. The source file path of the modified file is mapped to a corresponding destination file path in a destination file system, and the modified file is copied from the source file path in the source file system to the destination file path in the destination file system, as suggested by Hill (see abstract and summary).
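The monitoring behavior attributed to Hill (detect a modified designated file, map its source path to a destination path, copy it) can be sketched as a single polling pass. This is a hedged illustration, not Hill's agent: the function names and the use of modification timestamps as the change signal are assumptions, and the timer that would call `sync_once` at each interval is omitted.

```python
import os
import shutil
from pathlib import Path


def detect_modified(designated, last_seen):
    """Return designated files whose modification time changed since the last poll."""
    modified = []
    for path in designated:
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            last_seen[path] = mtime  # remember the state seen on this poll
            modified.append(path)
    return modified


def map_destination(path, source_root, destination_root):
    """Map a source file path to the matching path under the destination root."""
    return Path(destination_root) / Path(path).relative_to(source_root)


def sync_once(designated, last_seen, source_root, destination_root):
    """One polling pass: copy every modified designated file to its destination."""
    copied = []
    for path in detect_modified(designated, last_seen):
        dest = map_destination(path, source_root, destination_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, dest)
        copied.append(dest)
    return copied
```

Because `last_seen` starts empty, the first pass treats every designated file as modified and copies it; later passes copy only files whose timestamp has changed.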
 Saas, Langton and Hill do not explicitly teach
determining, by the server, an embedding dissimilarity between the agent embeddings generated by the cyber security agent and  reference source code embeddings representing publicly-available open source code
However, Goodsitt teaches
determining, by the server, an embedding dissimilarity between the agent embeddings generated by the cyber security agent and reference source code embeddings representing publicly-available open source code (Goodsitt, US 12,164,867, column 11, line 5 to line 30, The one or more metrics may include a Euclidean distance, a cosine similarity, a Jaccard similarity, and/or a clustering metric, among other examples. For example, a Euclidean distance may indicate a similarity between two (or more) documents. For example, if the Euclidean distance between two embeddings is small, this may indicate that the two vectors are similar, and the documents or words they represent are likely to be related in some way. If the Euclidean distance is large, then the embeddings may be dissimilar, and the documents or words are likely to be unrelated. Cosine similarity may be a measure of similarity between two high-dimensional vectors by calculating a cosine of an angle between the two high-dimensional vectors. For example, cosine similarity may be a measure of similarity that ranges from −1 to 1, where 1 indicates that the embeddings are identical, 0 indicates that the embeddings are orthogonal (i.e., unrelated), and −1 indicates that the embeddings are diametrically opposed. If the cosine similarity between two embeddings is close to 1, this may indicate that the two embeddings are similar, and the documents or words they represent are likely to be related in some way (e.g., resulting in a higher document similarity score). If the cosine similarity is close to 0, the embeddings may be dissimilar, and the documents or words are likely to be unrelated (e.g., resulting in a lower document similarity score).  Fig. 4 and column 17, line 47 to column 18, line 15, As further shown in FIG. 4, process 400 may include generating, based on comparing the first embedding set to the second embedding set, a code repository similarity score that indicates a similarity between the first code repository and the second code repository (block 450).).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Goodsitt into Saas, Langton and Hill’s invention to obtain a first document set associated with a first code repository, generate a first embedding set for the respective documents in the first document set, and generate document similarity scores for those documents by comparing the first embedding set to a second embedding set for the respective documents in a second document set associated with a second code repository. The first document set includes one of a codebase, a code file, a configuration file, a library, or a support document, as suggested by Goodsitt (see abstract and summary).
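The two metrics quoted from Goodsitt, cosine similarity and Euclidean distance, can be made concrete with a short sketch. The `is_dissimilar` helper and its 0.8 threshold are hypothetical and are not taken from any cited reference or from the claims; they only show how a similarity score could be turned into a dissimilarity decision.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_dissimilar(agent_embedding, reference_embeddings, threshold=0.8):
    """True when no reference embedding reaches the (hypothetical) threshold."""
    if not reference_embeddings:
        return True  # nothing to be similar to
    best = max(cosine_similarity(agent_embedding, r) for r in reference_embeddings)
    return best < threshold
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, which matches the range described in the quoted passage.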
The Office notes that Goodsitt also teaches 
agent embeddings representing the source code file(Goodsitt, column 6, line 20-59, As shown by reference number 115, the comparison device may generate one or more embeddings for one or more respective documents included in the one or more code repositories. For example, the comparison device may generate one or more embeddings for one or more respective documents included in the first code repository. An embedding (also referred to as an embedding vector) may be a mapping of a discrete (e.g., categorical) variable to a vector (e.g., an embedding vector) of numbers (e.g., continuous numbers). For example, embeddings may be low dimensional, learned continuous vector representations of discrete variables. In other words, embeddings are numerical representations of objects, such as words or images, that are learned by deep learning algorithms from large amounts of data. The embeddings may be high-dimensional, meaning they consist of a large number of features. For example, a model may generate word embeddings (e.g., that enable words with similar meanings to have a similar representation in an embedding space). For example, word embeddings may enable individual words to be represented as real-valued vectors in a predefined embedding space. Each word or phrase (e.g., a set of words) may be mapped to one embedding vector, and the embedding vector values may be learned in a way that resembles how a neural network learns.)
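As a toy illustration of the mapping described in the Goodsitt passage (each token mapped to one vector of numbers), the sketch below averages per-token vectors into a single document vector. The lookup table is hand-written and hypothetical; in the systems Goodsitt describes, the vectors would be learned by a model such as BERT or Word2vec, and averaging is only one simple way to pool them.

```python
# Hypothetical two-dimensional vectors; learned embeddings would have many more
# dimensions and would come from a trained model, not a hand-written table.
TOY_VECTORS = {
    "def": (1.0, 0.0),
    "return": (0.0, 1.0),
    "import": (1.0, 1.0),
}


def embed_tokens(tokens, table=TOY_VECTORS):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return None
    dims = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dims))
```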


Claim 2 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, further comprising prompting an artificial intelligence model using the source code file (Goodsitt, column 6, line 44 to column 7, line 17, the comparison device may generate the one or more embeddings using a machine learning model. The machine learning model may be trained to generate a numerical representation of a document that captures the document's meaning and context. The machine learning model may be any machine learning model configured to generate embeddings or embedding vectors for documents and/or portions of documents (e.g., code functions, characters, strings of characters, portions of a file, or other portions of a document) associated with code repositories. For example, the machine learning model may include a “bidirectional encoder representations from transformers” (BERT) model, a Word2vec model, a “global vectors for word representation” (GloVe) model, a residual network (ResNet) model, and/or an autoencoder model, among other examples. Saas, fig. 2 and paragraph [0071-0076], Pane 216 may provide graphic representation of the results, for example a pie chart indicating the percentage of the user's entities in which open source was not used, and the percentage in which any found open source project is used.).  
Claim 3 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, further comprising blocking operating system events associated with the source code file(Saas, paragraph [0044], Another technical effect of utilizing the disclosed subject matter is the determination of open source code presence in a user's code in a non-intrusive manner and without having to transmit the code out of the user's network, by transmitting only characteristics of the code, thus avoiding copyright infringement and security hazards, and promoting efficiency since redundant storage, communication volume and intensive comparisons are eliminated.  Paragraph [0096], copyright-protected material.  Langton, Column 8, line 37 to 56,  As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.   Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.).  
Claim 4 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, further comprising quarantining the source code file (Saas, paragraph [0004-0006],  source code may also carry hazards. One such danger may relate to the need to trust code received from an external source. Such code may contain bugs, time or space inefficiencies, or even viruses, Trojan horses, or the like. Such threat may be overcome by using only open source provided by known and trusted origin.  Langton, Column 8, line 37 to 56,  As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.   Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.).  
Claim 5 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, wherein in response to the determining of the embedding dissimilarity, further comprising determining the source code file represents proprietary programming (Saas, fig. 2 and paragraph [0071-0076], Pane 216 may provide graphic representation of the results, for example a pie chart indicating the percentage of the user's entities in which open source was not used, and the percentage in which any found open source project is used.  Paragraph [0096], copyright-protected material.  Goodsitt, column 11, line 5 to line 30.).  
Claim 8 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, further comprising distributing a pre-trained machine learning model to the cyber security agent, the pre-trained machine learning model trained using the publicly-available open source code (Saas, paragraph [0046], On step 100, an open source entity is received, such as a file extracted from a known open source library, or a part thereof..  Paragraph [0047], On step 104, one or more characteristics may be determined for the entity. Exemplary characteristics may include but are not limited to: a result of a hash function such as SHA-1 function applied to the entity, for example a file, a part of the a such as the first or another specific part of a file, specific range of lines from a file, a function, a method, a file library, a directory, or the like.  Paragraph [0064], If the characteristic is a hash value, then comparison may relate to numerical comparison. If the keyword sequence implementation is used, then comparing may relate to searching for the exact sequence or to a common subsequence, in order to recognize, for example, the presence of a code snippet essentially copied from an open source project in a user file. In some embodiments, substantial similarity, or similarity exceeding a predetermined threshold may be required rather than absolute identity, in order to be able to recognize open source even in cases where the user introduced modifications to the code, such as deleting lines, adding lines, changing names, or the like.  Fig. 3 and paragraph [0079-0086], Storage device 312 may store characteristic determination component 320 for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like.  
Paragraph [0087-0090], Storage device 316 may store characteristic determination component 332 corresponding to characteristic determination component 320, for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like. Characteristic determination component 332 may operate upon entities associated with known open source projects while creating repository 328.  Goodsitt, column 6, line 44 to column 7, line 17, the comparison device may generate the one or more embeddings using a machine learning model.).  
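The hash-based characteristics and threshold comparison quoted from Saas (paragraphs [0047] and [0064]) can be sketched as follows. This is an assumption-laden illustration, not Saas's algorithm: hashing sliding windows of three non-blank lines and scoring by the fraction of matching windows are choices made here for the example.

```python
import hashlib


def sha1_characteristic(text):
    """SHA-1 hex digest of a piece of source text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def line_range_characteristics(text, window=3):
    """Hash every run of `window` consecutive non-blank lines (window is hypothetical)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return set()
    last_start = max(1, len(lines) - window + 1)
    return {
        sha1_characteristic("\n".join(lines[i:i + window]))
        for i in range(last_start)
    }


def overlap_ratio(user_text, repository, window=3):
    """Fraction of the user's line-range hashes that also appear in the repository."""
    user = line_range_characteristics(user_text, window)
    if not user:
        return 0.0
    return len(user & repository) / len(user)
```

Only the digests would need to leave the user's network, and a ratio below 1.0 can still exceed a chosen threshold when the user has changed a few lines of copied code.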
Claim 10 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill and Goodsitt teach the method of claim 1, further comprising determining a centrality importance associated with the source code file, the centrality importance based on version control information(Saas, paragraph [0006], Some licenses may require copyright and notification of the license. Others may require that if a user modified the used open source, for example fixed a bug, the user shares the modified version with other users in the same manner as the original source code was shared. Further licenses may require sharing the users' code developed with the open source with other users. The extent for which sharing is required may vary between files linked with files containing open source, and the whole user project. Further requirements may even have implications on the user's clients which may use the project developed with open source.  Paragraph [0020-0024],  Another technical problem dealt with by the disclosed subject matter is the need to identify which open source project and which version thereof the source code found in the user's project belongs to.  Paragraph [0053].) .  
Claim 11 is rejected. Saas teaches a method, comprising:
generating, by the cyber security agent, agent embeddings representing the source code file by using a pre-trained machine learning model associated with a cloud-based source code similarity service(Saas, US 20160202972, fig. 1, paragraph [0045-0046], On step 108, the characteristic and an identifier of the entity, such as the file library or the file name, may be stored in the repository.  Fig. 1 and paragraph [0047-0052], On step 104, one or more characteristics may be determined for the entity. Exemplary characteristics may include but are not limited to: a result of a hash function such as SHA-1 function applied to the entity, for example a file, a part of the a such as the first or another specific part of a file, specific range of lines from a file, a function, a method, a file library, a directory, or the like.  Fig. 1 and paragraph [0061], On step 128, the user code characteristic is transmitted from the user's network to a location in which it may be compared to the repository, such as the cloud or a computing platform having access to the cloud, to a server being in communication with the repository, or the like. In some embodiments, the file or entity name, library or another identifier may also be transmitted. In some embodiments, license information extracted from the entity may also be transmitted.  Paragraph [0062], On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted.  Fig. 3 and paragraph [0079-0086], Storage device 312 may store characteristic determination component 320 for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like.  
Paragraph [0087-0090], Storage device 316 may store characteristic determination component 332 corresponding to characteristic determination component 320, for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like. Characteristic determination component 332 may operate upon entities associated with known open source projects while creating repository 328. Paragraph [0062],  On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted.  Fig. 3 and paragraph [0080], user computing platform 300 and server 302 may be implemented on one device, such as a server.  Paragraph [0091-0094], Storage device 316 may store comparison component 336 for determining whether a given characteristic appears in repository 328.); 
uploading, by the cyber security agent, the agent embeddings to the cloud-based source code similarity service(Saas,   Fig. 1 and paragraph [0061], On step 128, the user code characteristic is transmitted from the user's network to a location in which it may be compared to the repository, such as the cloud or a computing platform having access to the cloud, to a server being in communication with the repository, or the like. In some embodiments, the file or entity name, library or another identifier may also be transmitted. In some embodiments, license information extracted from the entity may also be transmitted.  Paragraph [0062], On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted.); 
 
Saas does not explicitly teach
in response to the kernel notification, suspending, by the cyber security agent, source code file operations requested by the source code file;
in response to the embedding dissimilarity, blocking, by the cyber security agent, the source code file operations
However, Langton teaches
in response to the kernel notification, suspending, by the cyber security agent, source code file operations requested by the source code file(Langton, Column 8, line 37 to 56, In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.  Column 13, line 42 to 59,  In some implementations, exfiltration detection device 240 may counteract data exfiltration by identifying the file as suspicious (e.g., using a malware indicator). In this case, the file may have previously been identified as unsuspicious due to a failure by security device 220 to detect the data exfiltration. As such, exfiltration detection device 240 may update a stored malware indicator, associated with the file, from an indication that the file is unsuspicious to an indication that the file is suspicious);

in response to the embedding dissimilarity, blocking, by the cyber security agent, the source code file operations(Langton, US 10091222, column 6, line 25 to 54,  As shown in FIG. 4, process 400 may include receiving a file to be tested for data exfiltration (block 410). For example, security device 220 may receive a file (e.g., an executable file, an application, a program, etc.) to be tested for data exfiltration. In some implementations, the file may be associated with client device 210 (e.g., may be stored by client device 210, may be executing on client device 210, may be requested by client device 210, etc.). As an example, client device 210 may request a file (e.g., from a website, via an email link, etc.), and security device 220 may receive and/or test the file before the file is provided to client device 210. In some implementations, security device 220 may test the file in a testing environment, such as a sandbox environment.  Column 8, line 25 to 36,  As further shown in FIG. 4, process 400 may include determining whether the exfiltration information is detected in the outbound network traffic (block 450). For example, security device 220 may monitor outbound network traffic to detect whether the outbound network traffic includes the exfiltration information (e.g., a resource identifier, information designed to appear to be sensitive information, etc.). In some implementations, security device 220 may monitor outbound network traffic for plaintext that matches text of the exfiltration information (e.g., text corresponding to a resource identifier, text corresponding to sensitive information, etc.  Column 8, line 37 to 56,  As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). 
For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.   Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.   Column 8, line 57 to column 9, line 5,  If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210. In this way, security device 220 may prevent a malicious file from exfiltrating data. ).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Langton into Saas’s invention to determine whether a file is a data exfiltration malware application when data exfiltration occurs and/or after data exfiltration occurs, thus increasing the likelihood that data exfiltration is detected and improving the security of stored information. The communication interface permits a device to receive information from another device and/or provide information to another device. The security device uses the malware indicator to identify the file as malware and performs an action to counteract the malware. The likelihood of detecting data exfiltration increases, thus providing better information security, as suggested by Langton (see abstract and summary).
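For illustration only, the Langton passages quoted above (blocks 450 and 460) can be sketched in Python as a plaintext match of planted exfiltration information against outbound traffic, followed by storing a malware indicator. The names `FileRecord`, `check_outbound`, and `may_release` are hypothetical and do not appear in Langton; the sketch assumes the plaintext-matching variant that the reference describes as one option.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical record a security device might keep for a file under test."""
    name: str
    suspicious: bool = False  # stands in for the stored malware indicator


def check_outbound(record: FileRecord, outbound: bytes, exfil_info: bytes) -> bool:
    """Mark the file suspicious if the planted exfiltration information
    appears as plaintext in the outbound traffic; return the indicator."""
    if exfil_info in outbound:
        record.suspicious = True
    return record.suspicious


def may_release(record: FileRecord) -> bool:
    """The file is provided to the client only if it was not flagged."""
    return not record.suspicious
```

In this sketch the counteracting action is limited to setting the indicator; deleting the file from memory, which Langton also mentions, is not modeled.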
Saas and Langton do not explicitly teach
receiving, by a cyber security agent, a kernel notification sent by a host operating system detecting a source code file;
However, Hill teaches
receiving, by a cyber security agent, a kernel notification sent by a host operating system detecting a source code file(Hill, US 20160283214, para [0024-0025], agent 50 resides on computer 10. For example, the source code on computer 10 is built. Agent 50 is configured to monitor certain designated files of the built code. The built code is packaged and deployed to server 40, but agent 50 continues to reside on computer 10. Agent 50 monitors the designated files on computer 10 for modifications. If modifications are detected for designated files, agent 50 copies the modified files over network 30 from its source file path on computer 10 to the destination file path on server 40 in near-real time.  Para [0017], an agent monitors the source directory tree for changes to the code and synchronizes those changes in near-real time to the corresponding location in the destination directory tree on the end testing system. The agent is configured with a list of source directories to monitor and a corresponding location on the end testing system for each directory. When the agent detects a change in one of the source directories it is configured to monitor, the agent synchronizes the changed code to destination directories on the end system. As such, the package and deploy process is avoided during the iterative implementation and testing cycle.  Para [0023] and [0025],  Agent 50 must have access to the built code in source file system 22 on computer 10. As an example, agent 50 can reside on a development server running on the same computer 10 as the built source code. As another example, agent 50 can reside on a virtual machine on developer 2's computer 10 and access the built code from a shared directory. As a further example, agent 50 can reside on a remote computer, for example server 40, that has mounted the built code using a remote file system, for example destination file system 62. 
Agent 50 need not have access to the original source code on source file system 22 as long as agent 50 has access to the built code. For example, Java source files may be converted into Jar files or CoffeeScript files may be converted into JavaScript files. Agent 50 may be configured to monitor any of these files as long as agent 50 has access to the files on source file system 22.);
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Hill into Saas and Langton’s invention to monitor multiple files in a source file system at pre-determined time intervals. A file in the source file system that has been modified is detected, and the modified file is determined to be a designated file. The source file path of the modified file is mapped to a corresponding destination file path in a destination file system. The modified file is copied from the source file path in the source file system to the destination file path in the destination file system, as suggested by Hill (see abstract and summary).
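As a minimal sketch of the monitoring that the Hill passages describe (polling designated files, detecting a modification, mapping the source path to a destination path, and copying), the following Python function performs one scan. The function name `scan_once`, the use of modification times to detect changes, and the dictionary-based path mapping are assumptions made for illustration; Hill does not prescribe them.

```python
import shutil
from pathlib import Path


def scan_once(mapping: dict[Path, Path], last_seen: dict[Path, float]) -> list[Path]:
    """Copy each designated source file whose modification time differs from
    the one recorded on the previous scan; return the files that were copied."""
    copied = []
    for src, dst in mapping.items():
        if not src.exists():
            continue  # a designated file may not have been built yet
        mtime = src.stat().st_mtime
        if last_seen.get(src) != mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # source path -> mapped destination path
            last_seen[src] = mtime
            copied.append(src)
    return copied
```

Calling `scan_once` in a loop with a fixed sleep between calls would approximate monitoring at pre-determined time intervals. Note that the first scan copies every designated file, because nothing has been recorded yet, and that this polling approach differs from a kernel notification delivered by the operating system.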
Saas, Langton and Hill do not explicitly teach
receiving, by the cyber security agent, an embedding dissimilarity generated by the cloud-based source code similarity service based on the agent embeddings
However, Goodsitt teaches
receiving, by the cyber security agent, an embedding dissimilarity generated by the cloud-based source code similarity service based on the agent embeddings (Goodsitt, US 12,164,867, column 11, line 5 to line 30, The one or more metrics may include a Euclidean distance, a cosine similarity, a Jaccard similarity, and/or a clustering metric, among other examples. For example, a Euclidean distance may indicate a similarity between two (or more) documents. For example, if the Euclidean distance between two embeddings is small, this may indicate that the two vectors are similar, and the documents or words they represent are likely to be related in some way. If the Euclidean distance is large, then the embeddings may be dissimilar, and the documents or words are likely to be unrelated. Cosine similarity may be a measure of similarity between two high-dimensional vectors by calculating a cosine of an angle between the two high-dimensional vectors. For example, cosine similarity may be a measure of similarity that ranges from −1 to 1, where 1 indicates that the embeddings are identical, 0 indicates that the embeddings are orthogonal (i.e., unrelated), and −1 indicates that the embeddings are diametrically opposed. If the cosine similarity between two embeddings is close to 1, this may indicate that the two embeddings are similar, and the documents or words they represent are likely to be related in some way (e.g., resulting in a higher document similarity score). If the cosine similarity is close to 0, the embeddings may be dissimilar, and the documents or words are likely to be unrelated (e.g., resulting in a lower document similarity score). Fig. 4 and column 17, line 47 to column 18, line 15, As further shown in FIG. 4, process 400 may include generating, based on comparing the first embedding set to the second embedding set, a code repository similarity score that indicates a similarity between the first code repository and the second code repository (block 450).).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Goodsitt into Saas, Langton and Hill’s invention to obtain a first document set of documents associated with a first code repository. The processors generate a first embedding set of embeddings for respective documents included in the first document set and generate document similarity scores for the respective documents included in the first document set based on comparing the first embedding set to a second embedding set of embeddings for respective documents included in a second document set of documents associated with a second code repository. The first document set includes one of a codebase, a code file, a configuration file, a library or a support document, as suggested by Goodsitt (see abstract and summary).
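The two metrics named in the Goodsitt passage quoted above can be sketched in Python as follows. The functions compute the standard Euclidean distance and cosine similarity between two embedding vectors. The `is_dissimilar` helper and its threshold of 0.5 are hypothetical and are included only to show how a similarity value might be turned into a dissimilarity decision; neither the threshold nor the helper comes from Goodsitt or from the claims.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def is_dissimilar(a, b, threshold=0.5):
    """Hypothetical rule: flag embeddings whose cosine similarity
    falls below the (assumed) threshold."""
    return cosine_similarity(a, b) < threshold
```

Under these definitions, identical directions give a cosine similarity of 1, orthogonal vectors give 0, and opposite directions give −1, which matches the ranges recited in the quoted passage.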
The Office notes that Goodsitt also teaches 
generating, agent embeddings representing the source code file(Goodsitt, column 6, line 20-59, As shown by reference number 115, the comparison device may generate one or more embeddings for one or more respective documents included in the one or more code repositories. For example, the comparison device may generate one or more embeddings for one or more respective documents included in the first code repository. An embedding (also referred to as an embedding vector) may be a mapping of a discrete (e.g., categorical) variable to a vector (e.g., an embedding vector) of numbers (e.g., continuous numbers). For example, embeddings may be low dimensional, learned continuous vector representations of discrete variables. In other words, embeddings are numerical representations of objects, such as words or images, that are learned by deep learning algorithms from large amounts of data. The embeddings may be high-dimensional, meaning they consist of a large number of features. For example, a model may generate word embeddings (e.g., that enable words with similar meanings to have a similar representation in an embedding space). For example, word embeddings may enable individual words to be represented as real-valued vectors in a predefined embedding space. Each word or phrase (e.g., a set of words) may be mapped to one embedding vector, and the embedding vector values may be learned in a way that resembles how a neural network learns.)
Claim 12 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, further comprising receiving version control information associated with the source code file (Saas, paragraph [0006], the user shares the modified version with other users in the same manner as the original source code was shared. Paragraph [0020-0023], which version thereof the source code found in the user's project belongs to. Paragraph [0024-0029], version.).
Claim 13 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, further comprising determining a centrality measure using the version control information associated with the source code file (Saas, paragraph [0006], Some licenses may require copyright and notification of the license. Others may require that if a user modified the used open source, for example fixed a bug, the user shares the modified version with other users in the same manner as the original source code was shared. Further licenses may require sharing the users' code developed with the open source with other users. The extent for which sharing is required may vary between files linked with files containing open source, and the whole user project. Further requirements may even have implications on the user's clients which may use the project developed with open source. Paragraph [0020-0024], Another technical problem dealt with by the disclosed subject matter is the need to identify which open source project and which version thereof the source code found in the user's project belongs to. Paragraph [0025-0029], version. Paragraph [0053].).
Claim 14 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, further comprising determining a centrality importance associated with the source code file, the centrality importance based on version control information (Saas, paragraph [0006], Some licenses may require copyright and notification of the license. Others may require that if a user modified the used open source, for example fixed a bug, the user shares the modified version with other users in the same manner as the original source code was shared. Further licenses may require sharing the users' code developed with the open source with other users. The extent for which sharing is required may vary between files linked with files containing open source, and the whole user project. Further requirements may even have implications on the user's clients which may use the project developed with open source. Paragraph [0020-0024], Another technical problem dealt with by the disclosed subject matter is the need to identify which open source project and which version thereof the source code found in the user's project belongs to. Paragraph [0025-0029], version. Paragraph [0053].).
Claim 15 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, further comprising instructing the host operating system to block the source code file operations (Saas, paragraph [0044], Another technical effect of utilizing the disclosed subject matter is the determination of open source code presence in a user's code in a non-intrusive manner and without having to transmit the code out of the user's network, by transmitting only characteristics of the code, thus avoiding copyright infringement and security hazards, and promoting efficiency since redundant storage, communication volume and intensive comparisons are eliminated. Paragraph [0096], copyright-protected material. Paragraph [0022]. Langton, Column 8, line 37 to 56, As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware. Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.
Column 8, line 57 to column 9, line 5,  If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210. In this way, security device 220 may prevent a malicious file from exfiltrating data.  Langton, Column 13, line 42 to 59,  In some implementations, exfiltration detection device 240 may counteract data exfiltration by identifying the file as suspicious (e.g., using a malware indicator). In this case, the file may have previously been identified as unsuspicious due to a failure by security device 220 to detect the data exfiltration. As such, exfiltration detection device 240 may update a stored malware indicator, associated with the file, from an indication that the file is unsuspicious to an indication that the file is suspicious).  
Claim 16 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, wherein in response to the receiving of the embedding dissimilarity, further comprising determining the source code file represents proprietary programming (Saas, fig. 2 and paragraph [0071-0076], Pane 216 may provide graphic representation of the results, for example a pie chart indicating the percentage of the user's entities in which open source was not used, and the percentage in which any found open source project is used. Paragraph [0096], copyright-protected material. Paragraph [0022]. Langton, Column 8, line 37 to 56, As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware. Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data. Column 8, line 57 to column 9, line 5, If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210.
In this way, security device 220 may prevent a malicious file from exfiltrating data. Goodsitt, column 11, line 5 to line 30, The one or more metrics may include a Euclidean distance, a cosine similarity, a Jaccard similarity, and/or a clustering metric, among other examples. For example, a Euclidean distance may indicate a similarity between two (or more) documents. For example, if the Euclidean distance between two embeddings is small, this may indicate that the two vectors are similar, and the documents or words they represent are likely to be related in some way. If the Euclidean distance is large, then the embeddings may be dissimilar, and the documents or words are likely to be unrelated.).
Claim 17 is rejected for the reasons set forth hereinabove for claim 11. Saas, Langton, Hill and Goodsitt teach the method of claim 11, wherein in response to the receiving of the embedding dissimilarity, further comprising determining the source code file represents an intellectual property (Saas, fig. 2 and paragraph [0071-0076], Pane 216 may provide graphic representation of the results, for example a pie chart indicating the percentage of the user's entities in which open source was not used, and the percentage in which any found open source project is used. Paragraph [0096], copyright-protected material. Paragraph [0044], by transmitting only characteristics of the code, thus avoiding copyright infringement. Paragraph [0059], copyright. Paragraph [0022]. Langton, Column 8, line 37 to 56, As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware. Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.
Column 8, line 57 to column 9, line 5, If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210. In this way, security device 220 may prevent a malicious file from exfiltrating data. Goodsitt, column 11, line 5 to line 30, The one or more metrics may include a Euclidean distance, a cosine similarity, a Jaccard similarity, and/or a clustering metric, among other examples. For example, a Euclidean distance may indicate a similarity between two (or more) documents. For example, if the Euclidean distance between two embeddings is small, this may indicate that the two vectors are similar, and the documents or words they represent are likely to be related in some way. If the Euclidean distance is large, then the embeddings may be dissimilar, and the documents or words are likely to be unrelated.).
Claim 18 is rejected. Saas teaches a memory device storing instructions that, when executed by a central processing unit, perform operations, the operations comprising:
receiving, by a cyber security agent installed on a computer system, a pre-trained machine learning model trained by a cloud-based source code similarity service using publicly-available open source code (Saas, US 20160202972, Fig. 3 and paragraph [0079-0086], Storage device 312 may store characteristic determination component 320 for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like.  Paragraph [0087-0090], Storage device 316 may store characteristic determination component 332 corresponding to characteristic determination component 320, for applying one or more algorithms to a source code entity, and obtaining a characteristic such as a numeric value, a sequence of keywords or identifiers, or the like. Characteristic determination component 332 may operate upon entities associated with known open source projects while creating repository 328.  Paragraph [0062],  On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted. Fig. 3 and paragraph [0080], user computing platform 300 and server 302 may be implemented on one device, such as a server.  Paragraph [0091-0094], Storage device 316 may store comparison component 336 for determining whether a given characteristic appears in repository 328.); 
uploading, by the cyber security agent, the agent embedding vector to the cloud-based source code similarity service(Saas,   Fig. 1 and paragraph [0061], On step 128, the user code characteristic is transmitted from the user's network to a location in which it may be compared to the repository, such as the cloud or a computing platform having access to the cloud, to a server being in communication with the repository, or the like. In some embodiments, the file or entity name, library or another identifier may also be transmitted. In some embodiments, license information extracted from the entity may also be transmitted.  Paragraph [0062], On step 132, the user code entity characteristic may be received, for example by a server. In some embodiments, the repository may be available locally to the user's computing environment, in which case transmitting and receiving the characteristics or other details may be omitted.); 
Saas does not explicitly teach
in response to the kernel notification, suspending, by the cyber security agent, source code file operations requested by the source code file;
in response to the receiving of the embedding dissimilarity, denying an exfiltration of the source code file by blocking, by the cyber security agent, the source code file operations.
However, Langton teaches
in response to the kernel notification, suspending, by the cyber security agent, source code file operations requested by the source code file(Langton, Column 8, line 37 to 56, In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.  Column 13, line 42 to 59,  In some implementations, exfiltration detection device 240 may counteract data exfiltration by identifying the file as suspicious (e.g., using a malware indicator). In this case, the file may have previously been identified as unsuspicious due to a failure by security device 220 to detect the data exfiltration. As such, exfiltration detection device 240 may update a stored malware indicator, associated with the file, from an indication that the file is unsuspicious to an indication that the file is suspicious);
in response to the receiving of the embedding dissimilarity, denying an exfiltration of the source code file by blocking, by the cyber security agent, the source code file operations (Langton, US 10091222, column 6, line 25 to 54,  As shown in FIG. 4, process 400 may include receiving a file to be tested for data exfiltration (block 410). For example, security device 220 may receive a file (e.g., an executable file, an application, a program, etc.) to be tested for data exfiltration. In some implementations, the file may be associated with client device 210 (e.g., may be stored by client device 210, may be executing on client device 210, may be requested by client device 210, etc.). As an example, client device 210 may request a file (e.g., from a website, via an email link, etc.), and security device 220 may receive and/or test the file before the file is provided to client device 210. In some implementations, security device 220 may test the file in a testing environment, such as a sandbox environment.  Column 8, line 25 to 36,  As further shown in FIG. 4, process 400 may include determining whether the exfiltration information is detected in the outbound network traffic (block 450). For example, security device 220 may monitor outbound network traffic to detect whether the outbound network traffic includes the exfiltration information (e.g., a resource identifier, information designed to appear to be sensitive information, etc.). In some implementations, security device 220 may monitor outbound network traffic for plaintext that matches text of the exfiltration information (e.g., text corresponding to a resource identifier, text corresponding to sensitive information, etc.  Column 8, line 37 to 56,  As further shown in FIG. 4, if the exfiltration information is detected in the outbound network traffic (block 450—YES), then process 400 may include performing an action to counteract data exfiltration (block 460). 
For example, if security device 220 detects the exfiltration information in the outbound network traffic, then security device 220 may perform an action to counteract data exfiltration. In some implementations, security device 220 may counteract data exfiltration by identifying the file as suspicious. In this case, security device 220 may store a malware indicator, in association with the file, that indicates that the file is suspicious (e.g., is malware). In this way, security device 220 and/or another device may use the malware indicator to identify the file as malware, and may perform an action to counteract the malware.   Additionally, or alternatively, security device 220 may counteract data exfiltration by identifying the file (e.g., in memory) and deleting the file from memory. In this way, security device 220 may prevent the file from exfiltrating data.    Column 8, line 57 to column 9, line 5,  If security device 220 determines that the file does not exfiltrate data, then security device 220 may provide the file to client device 210. In this way, security device 220 may prevent a malicious file from exfiltrating data.).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Langton into Saas’s invention to determine whether a file is a data exfiltration malware application when data exfiltration occurs and/or after data exfiltration occurs, thus increasing the likelihood that data exfiltration is detected and improving the security of stored information. The communication interface permits a device to receive information from another device and/or provide information to another device. The security device uses the malware indicator to identify the file as malware and performs an action to counteract the malware. The likelihood of detecting data exfiltration increases, thus providing better information security, as suggested by Langton (see abstract and summary).
Saas and Langton do not explicitly teach
receiving, by the cyber security agent, a kernel notification sent by a host operating system detecting a source code file
However, Hill teaches
receiving, by the cyber security agent, a kernel notification sent by a host operating system detecting a source code file (Hill, US 20160283214, para [0024-0025], agent 50 resides on computer 10. For example, the source code on computer 10 is built. Agent 50 is configured to monitor certain designated files of the built code. The built code is packaged and deployed to server 40, but agent 50 continues to reside on computer 10. Agent 50 monitors the designated files on computer 10 for modifications. If modifications are detected for designated files, agent 50 copies the modified files over network 30 from its source file path on computer 10 to the destination file path on server 40 in near-real time.  Para [0017], an agent monitors the source directory tree for changes to the code and synchronizes those changes in near-real time to the corresponding location in the destination directory tree on the end testing system. The agent is configured with a list of source directories to monitor and a corresponding location on the end testing system for each directory. When the agent detects a change in one of the source directories it is configured to monitor, the agent synchronizes the changed code to destination directories on the end system. As such, the package and deploy process is avoided during the iterative implementation and testing cycle.  Para [0023] and [0025],  Agent 50 must have access to the built code in source file system 22 on computer 10. As an example, agent 50 can reside on a development server running on the same computer 10 as the built source code. As another example, agent 50 can reside on a virtual machine on developer 2's computer 10 and access the built code from a shared directory. As a further example, agent 50 can reside on a remote computer, for example server 40, that has mounted the built code using a remote file system, for example destination file system 62. 
Agent 50 need not have access to the original source code on source file system 22 as long as agent 50 has access to the built code. For example, Java source files may be converted into Jar files or CoffeeScript files may be converted into JavaScript files. Agent 50 may be configured to monitor any of these files as long as agent 50 has access to the files on source file system 22.);
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Hill into Saas and Langton’s invention to monitor multiple files in a source file system at pre-determined time intervals. A modified file is detected in the source file system, and the modified file is determined to be a designated file. The source file path of the modified file is mapped to a corresponding destination file path in a destination file system. The modified file is copied from the source file path in the source file system to the destination file path in the destination file system, as suggested by Hill (see abstract and summary).
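For illustration only, the monitoring steps attributed to Hill above (detect a modified designated file, map its source path to a destination path, copy it) can be sketched as a single polling pass. This is a minimal sketch, not Hill's implementation: the function names, the use of file suffixes to mark "designated" files, and the modification-time comparison are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def map_to_destination(src_file: Path, src_root: Path, dst_root: Path) -> Path:
    """Map a path under the source tree to the same relative path under the destination tree."""
    return dst_root / src_file.relative_to(src_root)


def sync_modified(src_root: Path, dst_root: Path, designated: set, last_seen: dict) -> list:
    """Run one polling pass and return the destination paths that were written.

    `designated` is a hypothetical set of file suffixes to monitor (e.g. {".js"}).
    `last_seen` maps each source file to the modification time recorded on the
    previous pass; it is updated in place.
    """
    copied = []
    for src_file in sorted(src_root.rglob("*")):
        # Only regular files of a designated type are monitored.
        if not src_file.is_file() or src_file.suffix not in designated:
            continue
        mtime = src_file.stat().st_mtime_ns
        # Unchanged since the last pass: nothing to synchronize.
        if last_seen.get(src_file) == mtime:
            continue
        dst_file = map_to_destination(src_file, src_root, dst_root)
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)
        last_seen[src_file] = mtime
        copied.append(dst_file)
    return copied
```

Calling `sync_modified` repeatedly at a fixed interval would approximate the "pre-determined time intervals" language; an event-driven notification from the operating system is a different mechanism and is not shown here.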
Saas, Langton and Hill do not explicitly teach
generating, by the cyber security agent, an agent embedding vector associated with the source code file by using the pre-trained machine learning model trained by the cloud-based source code similarity service using the publicly-available open source code ;
receiving, by the cyber security agent, an embedding dissimilarity generated by the cloud-based source code similarity service indicating the agent embedding vector is dissimilar to the publicly-available open source code;
However, Goodsitt teaches
generating, by the cyber security agent, an agent embedding vector associated with the source code file by using the pre-trained machine learning model trained by the cloud-based source code similarity service using the publicly-available open source code (Goodsitt, column 6, line 20-59, As shown by reference number 115, the comparison device may generate one or more embeddings for one or more respective documents included in the one or more code repositories. For example, the comparison device may generate one or more embeddings for one or more respective documents included in the first code repository. An embedding (also referred to as an embedding vector) may be a mapping of a discrete (e.g., categorical) variable to a vector (e.g., an embedding vector) of numbers (e.g., continuous numbers). For example, embeddings may be low dimensional, learned continuous vector representations of discrete variables. In other words, embeddings are numerical representations of objects, such as words or images, that are learned by deep learning algorithms from large amounts of data. The embeddings may be high-dimensional, meaning they consist of a large number of features. For example, a model may generate word embeddings (e.g., that enable words with similar meanings to have a similar representation in an embedding space). For example, word embeddings may enable individual words to be represented as real-valued vectors in a predefined embedding space. Each word or phrase (e.g., a set of words) may be mapped to one embedding vector, and the embedding vector values may be learned in a way that resembles how a neural network learns.);
receiving, by the cyber security agent, an embedding dissimilarity generated by the cloud-based source code similarity service indicating the agent embedding vector is dissimilar to the publicly-available open source code (Goodsitt, US 12,164,867, column 11, line 5 to line 30, The one or more metrics may include a Euclidean distance, a cosine similarity, a Jaccard similarity, and/or a clustering metric, among other examples. For example, a Euclidean distance may indicate a similarity between two (or more) documents. For example, if the Euclidean distance between two embeddings is small, this may indicate that the two vectors are similar, and the documents or words they represent are likely to be related in some way. If the Euclidean distance is large, then the embeddings may be dissimilar, and the documents or words are likely to be unrelated. Cosine similarity may be a measure of similarity between two high-dimensional vectors may calculating a cosine of an angle between the between two high-dimensional vectors. For example, cosine similarity may be a measure of similarity that ranges from −1 to 1, where 1 indicates that the embeddings are identical, 0 indicates that the embeddings are orthogonal (i.e., unrelated), and −1 indicates that the embeddings are diametrically opposed. If the cosine similarity between two embeddings is close to 1, this may indicate that the two embeddings are similar, and the documents or words they represent are likely to be related in some way (e.g., resulting in a higher document similarity score). If the cosine similarity is close to 0, the embeddings may be dissimilar, and the documents or words are likely to be unrelated (e.g., resulting in a lower document similarity score). Fig. 4 and column 17, line 47 to column 18, line 15, As further shown in FIG. 4, process 400 may include generating, based on comparing the first embedding set to the second embedding set, a code repository similarity score that indicates a similarity between the first code repository and the second code repository (block 450).).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Goodsitt into Saas, Langton and Hill’s invention to obtain a first document set of documents associated with a first code repository. The processors generate a first embedding set of embeddings for respective documents included in the first document set and generate document similarity scores for the respective documents included in the first document set based on comparing the first embedding set to a second embedding set of embeddings for respective documents included in a second document set of documents associated with a second code repository. The first document set includes one of a codebase, a code file, a configuration file, a library, or a support document, as suggested by Goodsitt (see abstract and summary).
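For illustration only, the two embedding metrics quoted from Goodsitt (Euclidean distance and cosine similarity) can be computed as below. This is a minimal sketch using plain Python lists as stand-in embedding vectors; the function names are assumptions for the example and are not taken from any cited reference.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined for a zero vector.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Under these definitions, parallel vectors score close to 1, orthogonal vectors score 0, and opposed vectors score -1, which matches the ranges described in the quoted passage.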

Claim 19 is rejected for the reasons set forth hereinabove for claim 18. Saas, Langton, Hill and Goodsitt teach the memory device of claim 18, wherein the operations further comprise receiving version control information associated with the source code file (Saas, paragraph [0006], the user shares the modified version with other users in the same manner as the original source code was shared. Paragraph [0020-0023], which version thereof the source code found in the user's project belongs to. Paragraph [0024-0029], version.).
Claim 20 is rejected for the reasons set forth hereinabove for claim 18. Saas, Langton, Hill and Goodsitt teach the memory device of claim 18, wherein the operations further comprise determining a centrality importance associated with the source code file, the centrality importance based on version control information (Saas, paragraph [0006], Some licenses may require copyright and notification of the license. Others may require that if a user modified the used open source, for example fixed a bug, the user shares the modified version with other users in the same manner as the original source code was shared. Further licenses may require sharing the users' code developed with the open source with other users. The extent for which sharing is required may vary between files linked with files containing open source, and the whole user project. Further requirements may even have implications on the user's clients which may use the project developed with open source. Paragraph [0020-0024], Another technical problem dealt with by the disclosed subject matter is the need to identify which open source project and which version thereof the source code found in the user's project belongs to. Paragraph [0025-0029], version. Paragraph [0053].).

8.	Claims 6-7 and 9 are rejected under 35 U.S.C. 103 as being obvious over Saas (US 20160202972, hereinafter Saas, IDS of record), in view of Langton et al. (US 10091222, hereinafter Langton), in view of Hill et al. (US 20160283214, hereinafter Hill), in view of Goodsitt et al. (US 12,164,867, hereinafter Goodsitt), and further in view of Calcagno et al. (US 20140165045, hereinafter Calcagno).
With respect to claim 6, Saas, Langton, Hill and Goodsitt do not teach all limitations of claim 6.
However, Calcagno teaches
Claim 6 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton,  Hill, Goodsitt and Calcagno teach the method of claim 1, further comprising determining a file centrality importance associated with the source code file based on a programming link between the source code file and a different source code file (Calcagno, US 20140165045, fig. 2 and paragraph [0034],  Although it is the same type of defect in each case, a sensible goal is to fix the one that, in the larger picture, has a greater impact on the overall quality of the software project. The call graph shows that P7 is called by six (6) other procedures (taking into account the transitive closure of the call relation), whereas P3 is only called by one (1) other procedure. Intuitively, a developer working in a bottom-up fashion would want to fix the leak in P7 since having this procedure operate correctly is outwardly more central to the proper operation of the whole project. Another developer working in a top-down fashion might instead want to fix P3 first, since P3 calls one procedure while P7 calls none. The definition of call rank has parameters to specify the relative importance of in-calls and out-calls, to cater for a range of possible uses.).  
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Calcagno into Saas, Langton, Hill and Goodsitt’s invention to display software quality for a computing system. The graphical representation of software quality allows different users to examine different aspects of the software and to focus on changes and their effect in the context of the overall software project, thus improving productivity, quality and reliability of software, as suggested by Calcagno (see abstract and summary).
Claim 7 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton,  Hill, Goodsitt and Calcagno teach the method of claim 1, further comprising determining a file centrality importance associated with the source code file based on a page rank (Calcagno, fig. 2 and paragraph [0034],  Although it is the same type of defect in each case, a sensible goal is to fix the one that, in the larger picture, has a greater impact on the overall quality of the software project. The call graph shows that P7 is called by six (6) other procedures (taking into account the transitive closure of the call relation), whereas P3 is only called by one (1) other procedure. Intuitively, a developer working in a bottom-up fashion would want to fix the leak in P7 since having this procedure operate correctly is outwardly more central to the proper operation of the whole project. Another developer working in a top-down fashion might instead want to fix P3 first, since P3 calls one procedure while P7 calls none. The definition of call rank has parameters to specify the relative importance of in-calls and out-calls, to cater for a range of possible uses.).  
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Calcagno into Saas, Langton, Hill and Goodsitt’s invention to display software quality for a computing system. The graphical representation of software quality allows different users to examine different aspects of the software and to focus on changes and their effect in the context of the overall software project, thus improving productivity, quality and reliability of software, as suggested by Calcagno (see abstract and summary).
Claim 9 is rejected for the reasons set forth hereinabove for claim 1, Saas, Langton, Hill, Goodsitt and Calcagno teach the method of claim 1, further comprising determining a centrality measure associated with the source code file  (Calcagno, fig. 2 and paragraph [0034],  Although it is the same type of defect in each case, a sensible goal is to fix the one that, in the larger picture, has a greater impact on the overall quality of the software project. The call graph shows that P7 is called by six (6) other procedures (taking into account the transitive closure of the call relation), whereas P3 is only called by one (1) other procedure. Intuitively, a developer working in a bottom-up fashion would want to fix the leak in P7 since having this procedure operate correctly is outwardly more central to the proper operation of the whole project. Another developer working in a top-down fashion might instead want to fix P3 first, since P3 calls one procedure while P7 calls none. The definition of call rank has parameters to specify the relative importance of in-calls and out-calls, to cater for a range of possible uses.).  
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references. Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate Calcagno into Saas, Langton, Hill and Goodsitt’s invention to display software quality for a computing system. The graphical representation of software quality allows different users to examine different aspects of the software and to focus on changes and their effect in the context of the overall software project, thus improving productivity, quality and reliability of software, as suggested by Calcagno (see abstract and summary).
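For illustration only, the call-graph reasoning quoted from Calcagno paragraph [0034] (counting in-calls and out-calls over the transitive closure of the call relation, with parameters weighting each) can be sketched as below. The edges in `CALLS` are hypothetical, chosen only so that P7 has six transitive callers and P3 has one, as in the quoted passage; they are not the edges of Calcagno's Fig. 2. The weighted sum is an assumed stand-in and is not Calcagno's actual definition of call rank.

```python
from collections import deque

# Hypothetical call graph: each procedure maps to the procedures it calls directly.
CALLS = {
    "P1": ["P2"], "P2": ["P7"], "P4": ["P7"], "P5": ["P4"],
    "P6": ["P7"], "P8": ["P6"], "P9": ["P3"], "P3": ["P10"],
    "P7": [], "P10": [],
}


def reachable(graph, start):
    """Return every node reachable from `start` (transitive closure), excluding `start`."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen or node == start:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen


def reverse(graph):
    """Flip every edge so that each procedure maps to its direct callers."""
    rev = {node: [] for node in graph}
    for caller, callees in graph.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev


def weighted_rank(graph, proc, w_in=1.0, w_out=0.0):
    """Assumed scoring: weighted count of transitive callers (in) and callees (out)."""
    in_calls = len(reachable(reverse(graph), proc))
    out_calls = len(reachable(graph, proc))
    return w_in * in_calls + w_out * out_calls
```

With the default weights, the sketch favors P7 (the bottom-up view in the quoted passage); setting `w_in=0.0, w_out=1.0` favors P3 instead (the top-down view).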

Inquiry

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUY KHUONG THANH NGUYEN whose telephone number is (571)270-7139. The examiner can normally be reached Monday - Friday 0800-1630.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached at (571) 272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUY KHUONG T NGUYEN/           Primary Examiner, Art Unit 2199
Prosecution Timeline

(6 earlier events not shown)
Oct 02, 2025: Examiner Interview Summary
Oct 02, 2025: Applicant Interview (Telephonic)
Nov 07, 2025: Response after Non-Final Action
Dec 06, 2025: Request for Continued Examination
Dec 18, 2025: Response after Non-Final Action
Jan 12, 2026: Non-Final Rejection mailed (§103)
Feb 10, 2026: Applicant Interview (Telephonic)
Feb 10, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

18/561,590 (Patent 12632237): HIERARCHICAL COMPILING AND EXECUTION IN A MACHINE LEARNING HARDWARE ACCELERATOR. 2y 6m to grant. Granted May 19, 2026.
18/394,357 (Patent 12625796): TESTING AUTOMATION FOR OPEN STANDARD CLOUD SERVICES APPLICATIONS. 2y 4m to grant. Granted May 12, 2026.
18/507,562 (Patent 12619421): METHOD FOR UPDATING APPLICATION AND ELECTRONIC DEVICE THEREOF. 2y 5m to grant. Granted May 05, 2026.
17/573,512 (Patent 12613687): METHODS AND SYSTEMS FOR DYNAMICALLY CREATING UPGRADE SPECIFICATIONS BASED ON PER DEVICE CAPABILITIES. 4y 3m to grant. Granted Apr 28, 2026.
17/802,671 (Patent 12613791): METHOD AND DEVICE FOR TESTING THE COMPATIBILITY BETWEEN APPLICATION SOFTWARE AND A MOBILE WORKING MACHINE. 3y 8m to grant. Granted Apr 28, 2026.

Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview (+34.6%): 99%
Median Time to Grant: 2y 8m (~0m remaining)
PTA Risk: High
Based on 546 resolved cases by this examiner. Grant probability derived from career allowance rate.