Regression model #4
base: main
Conversation
notebooks/regression.ipynb
Outdated
    train_bools = []
    epitope_bools = []
    rsa_vals = []
    for (esm_emb, seq, train_boolmask, epitope_boolmask, rsa) in bp3.iter_rows():
Just a quality-of-life thing: when you have a large number of columns and don't want to unpack the whole tuple and name everything, you can always do this instead:
for row in bp3.iter_rows(named=True):
esm_emb = row["esm_emb"]
notebooks/regression.ipynb
Outdated
    }
    ],
    "source": [
    "# --- Transform to Per-Residue Basis ---\n",
This cell could also be replaced with something like:

    bp3.explode("esm_emb", pl.col("seq").str.split(""), "train_boolmask", "epitope_boolmask", "RSA")

although this would require you to load the esm_emb object into the DataFrame as a list rather than a tensor, since polars doesn't know the length of a tensor.
notebooks/regression.ipynb
Outdated
    esm_embeddings = []
    for job_num in range(bp3.shape[0]):
        job_name = bp3.select("job_name")[job_num].item()
        esm_embeddings.append(torch.load(ESM_ENCODING_DIR / (job_name + ".pt")))
I would change this to be a list, since lists are converted into polars lists when inserted as a column and play nicely with polars. Notice how in the printed DataFrame polars treats the tensor as an opaque "object"/binary blob, so polars operations won't work on it:
torch.load(ESM_ENCODING_DIR / (job_name + ".pt")).tolist()
    X_df = train_df[agg_features]
    y_df = train_df["epitope_bools"]

    X = X_df.values
Polars version of this is:

    bp3_res.select(agg_features).to_numpy()

On the next quoted hunk:

    y_train, y_test = y[train_index], y[test_index]

    # --- Scale Features ---
    scaler = StandardScaler()
Does BepiPred scale embedding features? Just curious whether this is necessary.