Dev #1
base: master
|
|
||
| train = pd.read_csv(os.path.join(os.path.dirname(__file__), 'data', 't.csv'), header=0, delimiter="\t", quoting=3)  # load the training dataset | ||
|
|
||
| train = train[:5000] |
Why is everything truncated to only 5000 tweets, both here and when training DeepMoji?
| model = LinearSVC(penalty='l2', loss='squared_hinge', dual=True, tol=0.0001, C=1.0, multi_class='ovr', | ||
| fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000) | ||
| model.fit(X, y) | ||
| print ("20 Fold CV Score. Bag of words: ", np.mean(cross_validation.cross_val_score(model, X, y, cv=20, scoring='roc_auc'))) |
The other model is evaluated without cross-validation, isn't it?
Yes, but this is just old code; I didn't change it.
|
|
||
| def train_model(nb_classes, DATASET_PATH, DATASET_PATH_PRETRAINED = '', | ||
| PRETRAINED_PATH='', delete_non_raws = False, save_model = False): | ||
| vocab = { |
By the way, what are these words?
These are the tags that the authors add during preprocessing.
Are they needed for Russian too?
Well, CUSTOM_URL and CUSTOM_NUMBER don't depend on the language, but in principle we'll need to check.
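For reference, a minimal sketch of the kind of language-independent substitution discussed above, assuming URLs and numbers are replaced by the CUSTOM_URL and CUSTOM_NUMBER tags before tokenization. The regular expressions and the function name `replace_special_tokens` are illustrative assumptions, not the authors' actual preprocessing code.

```python
import re

# Assumed patterns; the original preprocessing may match URLs and numbers differently.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)?\b")


def replace_special_tokens(text):
    """Replace URLs and numbers with language-independent placeholder tags."""
    # URLs go first so that digits inside a URL are not tagged separately.
    text = URL_RE.sub("CUSTOM_URL", text)
    text = NUMBER_RE.sub("CUSTOM_NUMBER", text)
    return text


print(replace_special_tokens("Смотри https://example.com/page1 там 25 фото"))
```

Because neither pattern refers to letters of a particular alphabet, the same substitution applies to Russian text unchanged; whether the remaining tags in `vocab` carry over would still need to be checked.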
|
|
||
| def review_to_wordlist( review, remove_stopwords=False ): | ||
| # review_text = BeautifulSoup(review).get_text() | ||
| review_text = review |
)))
| df['sent'].append(emoji_dict[emoji_name]) | ||
| return df | ||
|
|
||
| df = {'text':[], 'id':[], 'sent':[]} |
if __name__ == '__main__'
You're such a nitpicker :)
Not at all. I imported the dictionary from this module and got an error saying some file was missing.
Ah, well, I just created it for other purposes :)
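A minimal sketch of the guard suggested in this thread, assuming the module both defines a dictionary that other code imports and runs a script step that reads a file. Only the `df` layout and the `emoji_dict[emoji_name]` lookup come from the diff; the dictionary entries, the helper `build_table`, and the file name `emoji_tweets.tsv` are hypothetical placeholders.

```python
import csv
import os

# Safe to import: defining the dictionary touches no files.
# The entries are placeholders for illustration only.
emoji_dict = {"smile": 1, "cry": 0}


def build_table(rows):
    """Collect (id, text, emoji_name) rows into the dict-of-lists layout from the diff."""
    df = {'text': [], 'id': [], 'sent': []}
    for tweet_id, text, emoji_name in rows:
        df['id'].append(tweet_id)
        df['text'].append(text)
        df['sent'].append(emoji_dict[emoji_name])
    return df


def main(path="emoji_tweets.tsv"):  # hypothetical file name
    # File access happens only here, never at import time.
    if not os.path.exists(path):
        print("input file not found:", path)
        return
    with open(path, newline="", encoding="utf-8") as f:
        # Assumes three tab-separated columns per row: id, text, emoji name.
        rows = [tuple(r) for r in csv.reader(f, delimiter="\t")]
    print(len(build_table(rows)['text']), "rows loaded")


if __name__ == '__main__':
    main()
```

With this layout, importing `emoji_dict` from the module runs none of the file-reading code, so a missing data file can only affect direct execution of the script.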
No description provided.