Text-Attributed-Graph on Patent Data

This repository contains code to build a text-attributed graph from patent data. The graph represents patents as nodes and connects them based on semantic similarity of their textual content.

Features

Load and preprocess patent data from CSV files
Generate semantic embeddings using SentenceTransformers
Build a similarity graph based on customizable thresholds
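
The thresholded similarity graph described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: it assumes cosine similarity as the similarity measure, and the function name build_similarity_edges and the default threshold of 0.7 are hypothetical.

```python
import numpy as np


def build_similarity_edges(embeddings, threshold=0.7):
    """Return (i, j, similarity) for every node pair with cosine similarity >= threshold.

    Assumption: cosine similarity is the measure; the threshold value is illustrative.
    """
    emb = np.asarray(embeddings, dtype=np.float64)
    # Normalize rows to unit length so the dot product equals cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    # Look only at the upper triangle so each undirected pair appears once
    # and self-similarity is skipped.
    rows, cols = np.triu_indices(len(emb), k=1)
    pair_sims = sim[rows, cols]
    keep = pair_sims >= threshold
    return [
        (int(i), int(j), float(s))
        for i, j, s in zip(rows[keep], cols[keep], pair_sims[keep])
    ]
```

Raising the threshold yields a sparser graph. The full similarity matrix needs memory quadratic in the number of patents, so a large collection would likely need batching or an approximate nearest-neighbor search instead.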

Patent Graph Dataset

The dataset is located in the CS594PatentProject/output folder.

Generate Text Embeddings

Open GNN/embedding.py
Set the model you want to use (e.g., all-MiniLM-L6-v2)
Embeddings are saved to output/embeddings/MiniLML6.npy
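
As a rough sketch of the save step, the helper below encodes a list of texts and writes the result as a .npy file. The function save_embeddings and its encode parameter are hypothetical and are not taken from GNN/embedding.py. The encoder is passed in as a callable so the sketch does not depend on a particular model.

```python
import os

import numpy as np


def save_embeddings(texts, encode, out_path):
    """Encode texts with the given callable and save the matrix to out_path.

    `encode` is assumed to map a list of strings to a 2-D array-like
    with one row per input text.
    """
    embeddings = np.asarray(encode(texts), dtype=np.float32)
    if embeddings.ndim != 2 or embeddings.shape[0] != len(texts):
        raise ValueError("encoder must return one embedding row per text")
    # Create the target folder (e.g., output/embeddings) if it does not exist yet.
    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
    np.save(out_path, embeddings)
    return embeddings
```

With the sentence-transformers package installed, encode could be SentenceTransformer("all-MiniLM-L6-v2").encode, which returns a NumPy array with one 384-dimensional row per text.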

Data Loader

Open GNN/data_loader.py
Set embedding_path to the embedding file you generated:

embedding_path = os.path.join(root, "output", "embeddings", "MiniLML6.npy")
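
A mismatch between the embedding file and the graph is easy to miss, so a loader might validate the file before using it. The sketch below is an assumption about how that could look. The function load_embeddings and its expected_rows parameter are hypothetical and do not come from GNN/data_loader.py.

```python
import os

import numpy as np


def load_embeddings(root, filename="MiniLML6.npy", expected_rows=None):
    """Load an embedding matrix from <root>/output/embeddings/<filename>."""
    embedding_path = os.path.join(root, "output", "embeddings", filename)
    if not os.path.isfile(embedding_path):
        raise FileNotFoundError(f"embedding file not found: {embedding_path}")
    embeddings = np.load(embedding_path)
    if embeddings.ndim != 2:
        raise ValueError("expected a 2-D array with one row per patent")
    # Optionally check that the row count matches the number of graph nodes.
    if expected_rows is not None and embeddings.shape[0] != expected_rows:
        raise ValueError(
            f"expected {expected_rows} rows, found {embeddings.shape[0]}"
        )
    return embeddings
```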

Train the GNN Model

To train a GNN on the patent graph:

Open GNN/train.py
Choose the GNN model by modifying the line:

model = GCN(patent_data.num_node_features, 64, len(torch.unique(patent_data.y)))

Replace GCN with the GNN model you want to use, such as GCN_deep, GraphSAGE, GAT, or GIN.
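
The three constructor arguments are the input feature size, the hidden size (64), and the number of classes, taken as the number of distinct labels. To illustrate what they control, here is a NumPy sketch of a two-layer GCN forward pass. It is not the repository's model, which is written in PyTorch; the toy graph, labels, and random weights are made up for the example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization used by GCN."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg = a_hat.sum(axis=1)  # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(x, adj, w1, w2):
    """Two-layer GCN: logits = A_norm @ relu(A_norm @ X @ W1) @ W2."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ x @ w1, 0.0)  # ReLU
    return a_norm @ hidden @ w2


# Toy stand-ins for patent_data (all values are illustrative).
rng = np.random.default_rng(0)
num_nodes, num_node_features, hidden_dim = 5, 8, 64
y = np.array([0, 1, 2, 1, 0])
num_classes = len(np.unique(y))  # mirrors len(torch.unique(patent_data.y))

x = rng.normal(size=(num_nodes, num_node_features))
adj = np.zeros((num_nodes, num_nodes))
for i, j in [(0, 1), (1, 2), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0  # undirected edges

w1 = rng.normal(size=(num_node_features, hidden_dim))
w2 = rng.normal(size=(hidden_dim, num_classes))
logits = gcn_forward(x, adj, w1, w2)  # one row of class scores per node
```

The first weight matrix maps the node features to the hidden size and the second maps the hidden size to one score per class, which is why the constructor needs exactly these three numbers.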

Generate Explanations

Open the TAPE_modified folder and run explanation.py to generate explanations
Put your Hugging Face access token in the line: hf_token = "hf_token"
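
A token pasted into the source file can end up in a commit by accident. One alternative, sketched below as an assumption rather than a description of how explanation.py works, is to read the token from an environment variable. The helper get_hf_token is hypothetical. HF_TOKEN is the variable name that the huggingface_hub library also recognizes.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the Hugging Face token stored in the given environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face access token"
        )
    return token
```

The line in explanation.py could then become hf_token = get_hf_token(), with the token exported in the shell before the script is run.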

Authors

Homaira Huda Shomee (hshome2@uic.edu)
Ataher Sams (asams3@uic.edu)

Repository: hhshomee/CS594PatentProject
