Required Features (30 Columns)
URL Structure
having_IP_Address, URL_Length, Shortining_Service, having_At_Symbol, double_slash_redirecting, Prefix_Suffix, having_Sub_Domain, Abnormal_URL
Security
SSLfinal_State, HTTPS_token, port, Favicon, Statistical_report
Page Content
Request_URL, URL_of_Anchor, Links_in_tags, SFH, Submitting_to_email
Page Behavior
Redirect, on_mouseover, RightClick, popUpWidnow, Iframe
Domain Info
Domain_registeration_length, age_of_domain, DNSRecord, web_traffic, Page_Rank, Google_Index, Links_pointing_to_page
Value Range: All features should be integers with values:
-1 (Phishing indicator),
0 (Suspicious), or
1 (Legitimate indicator)
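The column list and value range above can be checked before a row is submitted. The following is a minimal sketch, not the project's own validator: the names `EXPECTED_COLUMNS`, `ALLOWED_VALUES`, and `validate_row` are hypothetical, and the row is assumed to arrive as a plain dictionary of column name to value.

```python
# Hypothetical helper: checks one feature row against the 30 required
# columns and the allowed value set {-1, 0, 1}.
EXPECTED_COLUMNS = [
    # URL Structure
    "having_IP_Address", "URL_Length", "Shortining_Service",
    "having_At_Symbol", "double_slash_redirecting", "Prefix_Suffix",
    "having_Sub_Domain", "Abnormal_URL",
    # Security
    "SSLfinal_State", "HTTPS_token", "port", "Favicon", "Statistical_report",
    # Page Content
    "Request_URL", "URL_of_Anchor", "Links_in_tags", "SFH",
    "Submitting_to_email",
    # Page Behavior
    "Redirect", "on_mouseover", "RightClick", "popUpWidnow", "Iframe",
    # Domain Info
    "Domain_registeration_length", "age_of_domain", "DNSRecord",
    "web_traffic", "Page_Rank", "Google_Index", "Links_pointing_to_page",
]
ALLOWED_VALUES = {-1, 0, 1}


def validate_row(row):
    """Return a list of problems for one row (dict); empty list means valid."""
    problems = []
    for col in EXPECTED_COLUMNS:
        if col not in row:
            problems.append(f"missing column: {col}")
            continue
        value = row[col]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{col}: expected an integer, got {value!r}")
        elif value not in ALLOWED_VALUES:
            problems.append(f"{col}: {value} is not one of -1, 0, 1")
    for col in row:
        if col not in EXPECTED_COLUMNS:
            problems.append(f"unexpected column: {col}")
    return problems
```

For example, `validate_row({c: 1 for c in EXPECTED_COLUMNS})` returns an empty list, while a row with `port` set to `2` yields a single message naming that column.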
How It Works
1. Data Ingestion
Raw data pulled from MongoDB and split into train/test sets
2. Validation
Schema validation and Kolmogorov-Smirnov (KS) drift detection
3. Transformation
KNN Imputation & target mapping
4. Model Training
Five models evaluated; the best is selected by F1 score
5. Deployment
Model served via FastAPI on Hugging Face
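Three of the steps above (KS drift detection, target mapping, and F1-based model selection) can be illustrated in plain Python. This is a rough sketch under stated assumptions, not the pipeline's actual code: the function names are hypothetical, the target mapping is assumed to turn the label -1 into 0, the positive class for F1 is assumed to be 1, and a real pipeline would more likely rely on library implementations than on these hand-written versions.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)  # share of ref values <= x
        cdf_cur = bisect_right(cur, x) / len(cur)  # share of cur values <= x
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def map_target(labels):
    """Assumed mapping: replace the label -1 with 0, leave others unchanged."""
    return [0 if y == -1 else y for y in labels]


def f1(y_true, y_pred, positive=1):
    """F1 score for the given positive class; 0.0 when no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


def select_best(y_true, predictions):
    """predictions maps a model name to its predicted labels."""
    scores = {name: f1(y_true, pred) for name, pred in predictions.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A drift check would compare `ks_statistic(train_column, new_column)` against a chosen threshold for each feature column; the threshold itself is a tuning decision that the source does not specify.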