10 Academy

Prompt Engineering: In-context learning with GPT-3 and other Large Language Models

Project Overview

A client has a system that collects news artifacts from web pages, tweets, Facebook posts, etc. The client is interested in scoring a given news artifact against a topic. The client has hired experts to score a few of these news items in the range from 0 to 10; a score of 0 means the news item is totally NOT relevant, while a score of 10 means the news item is very relevant. Scores between 0 and 10 signify the degree of relevance of the news item to the topic.

The client wants to explore how useful existing LLMs such as GPT-3 are for this task. You are hired as a consultant to evaluate the effectiveness of GPT-3-like LLMs for this task. If your recommendation is positive, you must demonstrate that your prompt-design strategies are reproducible and produce consistent results.
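One way to make prompt design reproducible is to build the few-shot prompt deterministically from the expert-scored examples, so identical inputs always yield an identical prompt. The sketch below is illustrative, not the project's actual prompt template; the instruction wording and example items are assumptions.

```python
# Illustrative sketch: build a deterministic few-shot scoring prompt
# from expert-labeled (news text, score) pairs. The template wording
# is an assumption, not the project's actual prompt.
def build_prompt(examples, new_item):
    """Serialize scored examples in a fixed order, then append the
    unscored item so the LLM completes its score."""
    lines = [
        "Score each news item's relevance to the topic from 0 "
        "(not relevant) to 10 (very relevant)."
    ]
    for text, score in examples:
        lines.append(f"News: {text}\nScore: {score}")
    # The new item ends with "Score:" so the model fills in the number.
    lines.append(f"News: {new_item}\nScore:")
    return "\n\n".join(lines)
```

Because the function is pure, the same examples and item always produce the same prompt, which is the consistency property the client asks for.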

You should also set up an MLOps pipeline that helps automate the task of using different LLMs and different topics. Your pipeline should also allow future improvements in the prompt design to be integrated without breaking the system. A centralized log system should be incorporated into your pipeline to help monitor outputs, cost, performance, and other relevant artifacts.
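A minimal form of the centralized log described above is an append-only JSONL file where every LLM call records its model, topic, output, and cost. The field names and file format below are assumptions for illustration, not the project's actual logging schema.

```python
# Minimal sketch of a centralized run log: each LLM call appends one
# JSON record. Field names are assumptions, not the project's schema.
import json
import time


def log_run(log_file, model, topic, prompt, output, cost_usd):
    """Append one run record to a JSONL log and return it."""
    record = {
        "timestamp": time.time(),
        "model": model,
        "topic": topic,
        "prompt": prompt,
        "output": output,
        "cost_usd": cost_usd,
    }
    with open(log_file, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

A JSONL log like this is easy to tail for monitoring and to load into pandas later for cost and performance analysis.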


Data

Our data is versioned using DVC.

news - the currently tracked versions of the news data are:

  • news-v0 : original version of the data
  • news-v1 : first-stage cleaned news data
  • test-news-v1 : enhanced test data
  • test-news-v2 : second enhanced test data
  • test_news-v0 : tracked test news data
  • train-news-v1 : enhanced train data
  • train-news-v2 : second enhanced train data
  • train_news-v0 : tracked train news data

Project Structure

The directories for this project are self-explanatory. The API (for making predictions) is set up in the api folder, the versioned data lives in the data folder, the notebook directory contains the notebooks for this project, and helper classes are in the scripts directory.

This project uses the co:here API for making predictions, so you need your own API key.

Create a config.py file in the root directory and place your API key in it as follows:
api_key = "**************"
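The key can then be imported wherever a co:here client is created. The helper below is a sketch, not the project's actual code; the COHERE_API_KEY environment-variable fallback is an assumption added for convenience.

```python
# Hypothetical sketch: load the co:here API key from config.py as
# described above, falling back to an environment variable. The
# COHERE_API_KEY fallback is an assumption, not part of the project.
import os


def load_api_key():
    """Return the API key from config.py, or from the environment."""
    try:
        from config import api_key  # config.py in the project root
        return api_key
    except ImportError:
        return os.environ.get("COHERE_API_KEY")
```

Keeping config.py out of version control (e.g. via .gitignore) avoids leaking the key in commits.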

If you want to fine-tune your model, you can find the tuner.txt file in the ./data/ directory. Use this file for fine-tuning co:here Generate.

Installation Guide

git clone https://github.com/Nathnael12/Prompt-engineering.git
cd Prompt-engineering
pip install -r requirements.txt

FastAPI

You will find it in the api directory. Three endpoints are included:

  • {host:port}/check used for checking whether or not our API is up
  • {host:port}/bnewscore used for predicting news scores
  • {host:port}/jdentities used for extracting job entities

For this project, use host:port = http://127.0.0.1:8000/
You can start the API with the following commands:

cd api
uvicorn app:app --reload

The above commands should start the API at http://127.0.0.1:8000/
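Once the server is running, the endpoints can be called with plain HTTP. The client sketch below is illustrative: the README does not specify the request schema, so the JSON body with a "text" field is an assumption to adjust against the actual endpoint.

```python
# Hypothetical client sketch for the /bnewscore endpoint. The JSON
# request body ({"text": ...}) is an assumption; the README does not
# document the actual schema.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"


def build_score_request(news_text):
    """Build a POST request asking the API to score a news item."""
    body = json.dumps({"text": news_text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/bnewscore",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_news(news_text):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_score_request(news_text)) as resp:
        return json.loads(resp.read())
```

The same pattern works for /check (a simple GET) and /jdentities (another POST), swapping in the appropriate path and payload.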
