BioTradingArena

Linear Regression + Search

Ridge regression on market cap, momentum, LLM-extracted press release (PR) and trial features, and pre-event web search features. The regularization strength (alpha) is tuned via inner cross-validation. The strategy combines LLM feature extraction with real-time market context.
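As a rough illustration of what tuning alpha via inner cross-validation can look like, here is a minimal pure-Python sketch. It is not the strategy's actual implementation: the function names (`fit_ridge`, `cv_mse`, `select_alpha`), the interleaved fold assignment, the alpha grid, and the synthetic data are all assumptions made for the example. In a nested setup, `select_alpha` would be called only on the training portion of each outer fold.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ridge(X, y, alpha):
    """Minimize ||y - Xw - b||^2 + alpha * ||w||^2; the intercept b is not penalized."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations on centered data: (Xc^T Xc + alpha * I) w = Xc^T yc
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept

def predict(X, w, intercept):
    return [sum(wj * xj for wj, xj in zip(w, row)) + intercept for row in X]

def cv_mse(X, y, alpha, k=5):
    """Mean squared error over k interleaved folds (an assumed split scheme)."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        w, b = fit_ridge([X[i] for i in train], [y[i] for i in train], alpha)
        preds = predict([X[i] for i in test], w, b)
        total += sum((p - y[i]) ** 2 for p, i in zip(preds, test))
    return total / n

def select_alpha(X, y, alphas, k=5):
    """Pick the alpha with the lowest cross-validated error."""
    return min(alphas, key=lambda a: cv_mse(X, y, a, k))

# Synthetic, noise-free toy data (not real catalyst data).
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
y = [2.0 * a - 1.0 * b + 0.5 for a, b in X]
best_alpha = select_alpha(X, y, alphas=[0.01, 1.0, 100.0])
weights, intercept = fit_ridge(X, y, best_alpha)
```

Because the toy target is exactly linear, the smallest alpha in the grid wins here; with noisy, correlated features a larger penalty would often be selected. For event data, a time-ordered split may be more appropriate than interleaved folds.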

Overview

Strategy Type

Linear Regression + Search

Number of Steps

2 steps

Models Used

Claude & GPT-4

Output Format

Regression features (numeric features for linear model)

Pipeline Visualization

LLM Feature Extraction (PR/Trial)

LLM prompt execution

Pre-Event Market Search Features

Automated web search (no LLM call)

Prompt Details

LLM Feature Extraction (PR/Trial)

System Prompt

You are a biotech equity analyst. Extract numeric features from the press release and clinical trial data to help a linear regression predict immediate post-catalyst percent change. Use only information known at or before the catalyst date. Output strict JSON only.

User Prompt Template

Provide the following fields (JSON only):
{
  "expected_direction": float in [-1,1],
  "expected_magnitude": float in [0,1],
  "trial_strength": float in [0,1],
  "safety_risk": float in [0,1],
  "priced_in": float in [0,1],
  "surprise": float in [0,1],
  "data_quality_confidence": float in [0,1]
}
CATALYST INFO:
- Ticker: {ticker}
- Company: {company}
- Drug: {drug}
- Phase: {phase}
- Indication: {indication}
- Event Type: {cat_type}
CLINICAL TRIAL (JSON):
{clinical_trial}
PRESS RELEASE:
{pr_text}

Expected Output Format

This prompt expects a JSON response. See the user prompt template for the exact structure.
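Since the regression needs clean numeric inputs, the JSON reply has to be parsed and range-checked before use. The sketch below shows one way to do that. The helper name `parse_features`, the choice to clamp out-of-range values rather than reject them, and the sample reply are assumptions for illustration, not documented behavior of the strategy.

```python
import json

# Field names and ranges copied from the user prompt template.
PR_TRIAL_RANGES = {
    "expected_direction": (-1.0, 1.0),
    "expected_magnitude": (0.0, 1.0),
    "trial_strength": (0.0, 1.0),
    "safety_risk": (0.0, 1.0),
    "priced_in": (0.0, 1.0),
    "surprise": (0.0, 1.0),
    "data_quality_confidence": (0.0, 1.0),
}

def parse_features(raw, ranges):
    """Parse a JSON reply into a dict of floats, clamped to the declared ranges."""
    data = json.loads(raw)
    features = {}
    for name, (low, high) in ranges.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric field: {name}")
        features[name] = min(max(float(value), low), high)
    return features

# Hypothetical model reply; "surprise" is deliberately out of range.
sample_reply = (
    '{"expected_direction": 0.6, "expected_magnitude": 0.3, '
    '"trial_strength": 0.8, "safety_risk": 0.2, "priced_in": 0.5, '
    '"surprise": 1.4, "data_quality_confidence": 0.9}'
)
pr_features = parse_features(sample_reply, PR_TRIAL_RANGES)
```

Clamping keeps a slightly out-of-range reply usable; rejecting and retrying the call would be an equally reasonable policy.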

Pre-Event Market Search Features

Automated

System Prompt

You are a biotech equity analyst. Using ONLY the pre-event search results, quantify market context features. Output strict JSON.

User Prompt Template

Event Date: {event_date}
Search Results (JSON):
{search_json}
Return JSON with:
{
  "analyst_sentiment": float in [-1,1],
  "pre_event_buzz": float in [0,1],
  "priced_in": float in [0,1],
  "surprise": float in [0,1],
  "expectations_miss_risk": float in [0,1],
  "leak_risk": float in [0,1],
  "evidence_strength": float in [0,1]
}

Expected Output Format

This prompt expects a JSON response. See the user prompt template for the exact structure.
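Both steps return fields named `priced_in` and `surprise`, so merging the two outputs into one feature row needs some way to keep them apart. The sketch below uses `pr_` and `search_` prefixes. The prefixes, the helper names, and all numeric values are assumptions for illustration; the page does not say how the strategy resolves the name collision.

```python
def build_feature_row(market_cap, momentum, pr_features, search_features):
    """Merge tabular inputs and both JSON feature sets into one flat dict."""
    row = {"market_cap": float(market_cap), "momentum": float(momentum)}
    row.update({f"pr_{name}": value for name, value in pr_features.items()})
    row.update({f"search_{name}": value for name, value in search_features.items()})
    return row

def to_vector(row, columns):
    """Order the values by a fixed column list so every event lines up."""
    return [row[name] for name in columns]

# Illustrative values only, not real model output.
pr_features = {
    "expected_direction": 0.6, "expected_magnitude": 0.3, "trial_strength": 0.8,
    "safety_risk": 0.2, "priced_in": 0.4, "surprise": 0.5,
    "data_quality_confidence": 0.9,
}
search_features = {
    "analyst_sentiment": 0.3, "pre_event_buzz": 0.6, "priced_in": 0.7,
    "surprise": 0.2, "expectations_miss_risk": 0.4, "leak_risk": 0.1,
    "evidence_strength": 0.5,
}
row = build_feature_row(1.2e9, 0.08, pr_features, search_features)
columns = sorted(row)  # any fixed order works, as long as it is reused
vector = to_vector(row, columns)
```

The resulting row has 16 columns: two tabular inputs plus seven features from each step.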

Template Variables Reference

These variables are dynamically replaced with actual values when the strategy is executed:

{cat_type}: Catalyst event type (e.g., FDA approval, trial results)

{clinical_trial}: Clinical trial data in JSON format

{company}: Company name

{drug}: Drug or therapy name

{event_date}: Date of the catalyst event

{indication}: Medical indication/disease being treated

{phase}: Clinical trial phase (1, 2, 3, etc.)

{pr_text}: Full press release text

{search_json}: Web search results in JSON format

{ticker}: Company stock ticker symbol
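The prompt templates mix `{name}` placeholders with literal JSON braces, so Python's `str.format` would fail on them. A minimal sketch of a safer substitution follows, assuming placeholders are always a single identifier wrapped in braces. The `render` helper and the sample values are hypothetical.

```python
import re

# Matches {identifier} only; JSON braces followed by spaces or quotes are left alone.
PLACEHOLDER = re.compile(r"\{(\w+)\}")

def render(template, variables):
    """Replace known {name} placeholders and leave everything else untouched."""
    def substitute(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        return match.group(0)  # unknown placeholder: keep it visible
    return PLACEHOLDER.sub(substitute, template)

template = (
    'Event Date: {event_date} Search Results (JSON): {search_json} '
    'Return JSON with: { "leak_risk": float in [0,1] }'
)
prompt = render(template, {"event_date": "2024-01-05", "search_json": "[]"})
```

Leaving unknown placeholders in place makes a missing variable easy to spot in the rendered prompt; raising an error instead would be a stricter alternative.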