prichakrabarti/h1b_hackathon

h1b_hackathon

H-1B Disclosure Dataset: Predicting the Case Status

This is the solution to a hackathon problem statement provided by GreyAtom. The Jupyter notebook contains all the code used to solve the problem.

This repo contains 3 files:

  1. Jupyter notebook: used to build the end-to-end solution and to experiment with different classifier models. After testing several models, the Random Forest Classifier was chosen because it gave the highest accuracy.
  2. h1b preprocessing Python file: preprocesses the data to prepare it for model training and testing.
  3. h1b train and test Python file: trains and tests the Random Forest Classifier.
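
The model selection step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes scikit-learn is installed and uses a synthetic dataset from `make_classification` in place of the preprocessed H-1B data. The function name `compare_classifiers`, the candidate models, and the 75/25 split are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, random_state=0):
    """Fit several candidate classifiers and return their test accuracy.

    Hypothetical helper: the candidate list and split ratio are assumptions,
    not taken from the repository.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=random_state),
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in for the preprocessed H-1B features and labels.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
scores = compare_classifiers(X, y)
best_model = max(scores, key=scores.get)  # name of the highest-accuracy model
print(scores, best_model)
```

Accuracy alone can be misleading when one case status dominates the data, so it may be worth checking per-class metrics as well before settling on a model.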

Problem Statement:

The H-1B dataset selected for this project contains data from employers' Labor Condition Applications and the case certification determinations processed by the Office of Foreign Labor Certification (OFLC), where the determination was issued on or after October 1, 2016, and on or before June 30, 2017. The Labor Condition Application (LCA) is a document that a prospective H-1B employer files with the U.S. Department of Labor Employment and Training Administration (DOLETA) when it seeks to employ nonimmigrant workers in a specific occupation in an area of intended employment for not more than three years.

The goal of this project is to predict the case status of an application submitted by an employer to hire nonimmigrant workers under the H-1B visa program. The employer can hire nonimmigrant workers only after its LCA petition is approved. The approved LCA petition is then submitted as part of the Petition for a Nonimmigrant Worker application for work authorization for H-1B visa status.
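
Since the prediction target is the case status, one way to prepare it for a classifier is to map each status string to an integer class. The sketch below is only an illustration. The status labels in `ASSUMED_STATUSES` are an assumption about the dataset's values and should be checked against the actual data. The helper name `encode_case_status` is hypothetical.

```python
from collections import Counter

# Assumed status labels; verify against the values present in the dataset.
ASSUMED_STATUSES = ["CERTIFIED", "CERTIFIED-WITHDRAWN", "DENIED", "WITHDRAWN"]


def encode_case_status(statuses, known=ASSUMED_STATUSES):
    """Map raw status strings to integer class ids (hypothetical helper)."""
    label_to_id = {label: i for i, label in enumerate(known)}
    encoded = []
    for raw in statuses:
        label = raw.strip().upper()  # tolerate stray whitespace and case
        if label not in label_to_id:
            raise ValueError(f"Unexpected case status: {raw!r}")
        encoded.append(label_to_id[label])
    return encoded, label_to_id


sample = ["CERTIFIED", "denied ", "CERTIFIED", "WITHDRAWN"]
encoded, mapping = encode_case_status(sample)
class_counts = Counter(encoded)  # quick look at class balance
print(encoded, dict(class_counts))
```

Raising on an unknown label, rather than silently dropping the row, makes it obvious when the assumed label set does not match the data.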
