CLUSTERING SOFTWARE VULNERABILITIES USING SELF-ORGANIZING MAPS: OBSERVATIONS AND ANALYSIS

Khyati Panchal

doi:10.7273/000004477

Thesis

Open access

Washington State University

Master of Science (MS), Washington State University

01/2022

DOI: https://doi.org/10.7273/000004477
Handle: https://hdl.handle.net/2376/124500

Files and links (1)

pdf: Khyati_Panchal_Thesis_Final (5), 1.36 MB

CC BY V4.0, Open Access

Abstract

Keywords: Clustering, DBI, SOM

The goal of this research is to analyze patterns in the annotation of Common Vulnerabilities and Exposures (CVE). One way to express these patterns is to relate CVEs to classes in the Common Weakness Enumeration (CWE). Our research aims to improve this mapping using the extensive annotation in the National Vulnerability Database (NVD). To this end, we process information from the NVD using the natural language processing model V2W-BERT, which generates a large tabular database of approximately 137,226 records, each characterizing an annotation by a vector with 768 numerical attributes. Given the data in vector form, we use unsupervised machine learning tools to discover patterns through clustering. One of the tools we use is the Self-Organizing Map (SOM), a well-established technique for data compression. We expect at least a 10-fold data compression, which means a SOM output array of 6417 nodes from the full dataset. We are investigating the most informative way to interpret the SOM output array. For example, we have investigated how the SOM-generated codebooks of the output array can suggest the number of clusters in a K-means representation of the tabular data, followed by a trace-back to the annotations to assign labels to the clusters.
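
The compression step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a small rectangular map, synthetic 3-dimensional data standing in for the 768-attribute V2W-BERT vectors, and linearly decaying learning rate and neighborhood width. The names `train_som` and `best_matching_unit` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Return the grid position of the node whose codebook vector is closest to x."""
    return min(
        codebook,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(codebook[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM and return {(row, col): codebook vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood width
    # Initialise every node with a copy of a randomly chosen record.
    codebook = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood shrinks
            bmu = best_matching_unit(codebook, x)
            for node, w in codebook.items():
                # Gaussian neighborhood measured on the 2-D output grid.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return codebook


# Synthetic stand-in data: two well-separated blobs of 100 records each.
rng = random.Random(42)
data = [[rng.gauss(0.0, 0.5) for _ in range(3)] for _ in range(100)]
data += [[rng.gauss(10.0, 0.5) for _ in range(3)] for _ in range(100)]

codebook = train_som(data, rows=4, cols=5)
compression = len(data) / len(codebook)
print(f"{len(data)} records -> {len(codebook)} nodes ({compression:.0f}-fold)")
```

In this toy setting, 200 records are summarized by 20 codebook vectors, a 10-fold reduction. By the same arithmetic, 137,226 records on 6417 nodes is roughly a 21-fold reduction. The codebook vectors, rather than the raw records, would then be the input to a K-means step of the kind the abstract mentions.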

Details

Title: CLUSTERING SOFTWARE VULNERABILITIES USING SELF-ORGANIZING MAPS
Creators: Khyati Panchal
Contributors: John Miller (Advisor)
Luis DeLaTorre (Committee Member)
Mahantesh Halappanavar (Committee Member)
Awarding Institution: Washington State University
Academic Unit: School of Engineering and Applied Sciences (TRIC)
Theses and Dissertations: Master of Science (MS), Washington State University
Publisher: Washington State University
Number of pages: 69
Identifiers: 99900882138101842
Language: English
Resource Type: Thesis