This project focuses on automating the visa application screening process by predicting the approval likelihood of visa petitions. By analyzing historical applicant and employer data, the project aims to help immigration agencies and consultancies streamline application reviews and improve decision-making efficiency.
The objective was to develop a machine learning model that predicts whether a visa application will be approved, based on applicant qualifications, employer attributes, and job-related factors. Such a model can reduce manual screening time, make screening decisions more consistent, and improve operational efficiency.
- Source: Provided as part of the project coursework
- Size: 10,000+ visa application records
- Key Features:
  - Applicant details (education, experience, salary)
  - Employer attributes (industry, size, region)
  - Job-related data (job title, SOC code, full-time status)
- Target: Visa Status (1 = Approved, 0 = Denied)
- Data Preprocessing – Cleaned and prepared the dataset by handling missing values, encoding categorical features, and standardizing variables.
- Exploratory Data Analysis (EDA) – Explored approval patterns and key influencing factors using visualizations and statistical techniques.
- Model Development – Built and evaluated multiple classification models to predict visa approval outcomes.
- Insights & Recommendations – Identified top predictive features and recommended strategies to improve visa approval success rates.
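
The preprocessing and model development steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic, the column names (`education`, `region`, `full_time`, `years_experience`, `offered_salary`, `visa_status`) are hypothetical, and the choice of logistic regression, random forest, and F1 score is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the visa dataset (hypothetical column names).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "education": rng.choice(["High School", "Bachelor's", "Master's", "Doctorate"], size=n),
    "region": rng.choice(["Northeast", "South", "West", "Midwest"], size=n),
    "full_time": rng.choice(["Y", "N"], size=n),
    "years_experience": rng.integers(0, 20, size=n).astype(float),
    "offered_salary": rng.normal(70000, 20000, size=n),
})
# Inject some missing salaries so the imputation step has work to do.
df.loc[rng.choice(n, size=25, replace=False), "offered_salary"] = np.nan
# Random target: 1 = Approved, 0 = Denied.
df["visa_status"] = rng.integers(0, 2, size=n)

categorical = ["education", "region", "full_time"]
numeric = ["years_experience", "offered_salary"]

# One-hot encode categorical columns; impute and standardize numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
])

X = df[categorical + numeric]
y = df["visa_status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Candidate classifiers (an assumed selection, not the project's list).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    # Fresh copy of the preprocessing step for each model.
    clf = Pipeline([("preprocess", clone(preprocess)), ("model", model)])
    clf.fit(X_train, y_train)
    scores[name] = f1_score(y_test, clf.predict(X_test))

print(scores)
```

Because the target here is random, the scores will hover near chance level; the point is only the shape of the workflow. Wrapping preprocessing and the model in one `Pipeline` keeps the imputation and scaling statistics fitted on the training split only.
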
- Identified key determinants of visa approval, such as education level, job classification, and offered salary.
- Delivered a predictive framework that can help agencies make faster and more consistent decisions.
- Enabled strategic recommendations for applicants and employers to improve approval chances.
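
One common way to rank determinants like those listed above is permutation importance, which measures how much a fitted model's score drops when a feature's values are shuffled. The sketch below is an assumption about how such a ranking could be produced, not the method confirmed by this project. The data is synthetic, the target is deliberately constructed from two features, and the column names (`education_level`, `offered_salary`, `noise`) are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic features (hypothetical names); "noise" has no relation to the target.
rng = np.random.default_rng(42)
n = 600
X = pd.DataFrame({
    "education_level": rng.integers(0, 4, size=n),  # ordinal code 0..3
    "offered_salary": rng.normal(70000, 20000, size=n),
    "noise": rng.normal(size=n),
})

# Synthetic target built on purpose from education and salary.
signal = 0.8 * X["education_level"] + (X["offered_salary"] - 70000) / 20000
y = (signal + rng.normal(scale=0.5, size=n) > 1.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature on held-out data and record the mean drop in accuracy.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranking = pd.Series(result.importances_mean, index=X.columns).sort_values(
    ascending=False
)
print(ranking)
```

Computing the importances on the held-out split rather than the training split gives a ranking that reflects what the model generalizes from, not what it memorized.
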
- Language: Python
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
- Tools: Jupyter Notebook / Google Colab
Sandesh S. Badwaik