
Adding New Data Types to be Scanned by the Django Backend #24

@chw3k5

Description


Extending TeleView's data indexing to other (non-smurf) data formats was intentionally designed into the TeleView project during initial development.

The scope of this issue is to consider the steps needed to upgrade TeleView's backend Django project and the MongoDB database. Once this task is completed, a new issue can be created to consider upgrades to the frontend, which will be less abstract and will depend highly on the new data types and their use/value to users.

We expect to add 3 or fewer new data types over the lifetime of TeleView. The addition of new data types is expected, but not overly abstracted. Imagine you are a developer who wants to add a new data type called elves: anywhere you see a file or function named smurf (the data type used for initial development), you will need to make a new parallel file or function called elves.

Warning

This is not a comprehensive guide to adding new data types to TeleView; it is only an overview of the major systems that must be upgraded.

New data location, naming, and parsing

  1. TeleView was designed to find all data under a single location. Externally, this data location is specified in the .env file under the variable name TELEVIEW_PLATFORMS_DATA_DIR; we will use platforms_dir to refer to this location in the rest of this issue report.

  2. The platforms_dir is expected to have any number of subdirectories; each subdirectory is considered a platform_name.

  3. The location platforms_dir/platform_name is searched for subdirectories whose names match a data_scraper function in find.py. These functions are collected at Python import time based on the decorator @data_scraper. At the time of writing, this only includes a single function, smurf; see the image below.

[image: the smurf data_scraper function in find.py]
  4. A data-specific parsing generator must be designed for that data type. This function must be a generator so that single-threaded memory usage remains stable for any size of file system being scraped/scanned. See the smurf generator in the image and link provided in step 3.
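The import-time collection described in step 3 can be sketched as a simple decorator registry. This is a hypothetical illustration of the pattern, not the actual code in find.py; the real decorator and the smurf scraper's body will differ.

```python
# Hypothetical sketch of the @data_scraper registry pattern described above.
# Functions are collected at import time, keyed by function name, so that
# directory names under platforms_dir/platform_name can be matched to scrapers.
data_scraper_functions = {}


def data_scraper(func):
    """Register a scraper function at import time under its own name."""
    data_scraper_functions[func.__name__] = func
    return func


@data_scraper
def smurf(platform_dir):
    # Placeholder body: the real smurf scraper walks the smurf directory
    # and yields one parsed record at a time.
    yield {"platform_dir": platform_dir, "data_type": "smurf"}
```

With this pattern, adding an elves data type is just a matter of defining a new `@data_scraper`-decorated generator named elves; no central list needs editing.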

Important

New data must be specified in a per-platform directory.

Important

To be included in the MongoDB database, a data-specific parsing generator is expected to have the same name as the directory where the data is found, i.e. the smurf directory houses data of a type that is parsed by the smurf generator.
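Following the naming convention above, a new elves scraper would be a generator whose name matches the elves directory. The sketch below assumes a flat directory of data files, which is an illustration only; the real layout and record schema for a new data type would be defined by that type.

```python
import os


def elves(platform_dir):
    """Hypothetical elves scraper: the function name matches the per-platform
    directory ("elves") that houses this data type."""
    elves_dir = os.path.join(platform_dir, "elves")
    if not os.path.isdir(elves_dir):
        return
    for entry in sorted(os.listdir(elves_dir)):
        # Yield one record per file so memory use stays flat no matter
        # how large the scanned file system is.
        yield {"platform_dir": platform_dir, "data_type": "elves", "file": entry}
```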

New data upload to database

Database uploads occur in the Python class DatabaseEvents in database.py.

Smurf-specific operations are denoted by methods with smurf_ in the method name. The method DatabaseEvents.upload_data() fully resets the database and remakes the indexing operations. The method DatabaseEvents.update_data() uploads data to an existing MongoDB collection.
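The parallel-method pattern for a new data type can be sketched as follows. The method names smurf_upload and elves_upload are hypothetical, and the real DatabaseEvents class in database.py manages MongoDB clients and index creation rather than an in-memory dict; this only illustrates the "one parallel method per data type" structure.

```python
class DatabaseEvents:
    """Illustrative stand-in for the real DatabaseEvents class in database.py;
    an in-memory dict replaces the MongoDB collections for this sketch."""

    def __init__(self):
        self.collections = {}

    def smurf_upload(self, records):
        # Full reset for the smurf data type: replace the collection outright.
        self.collections["smurf"] = list(records)

    def elves_upload(self, records):
        # A new data type gets its own parallel method (hypothetical name).
        self.collections["elves"] = list(records)

    def update_data(self, data_type, records):
        # Append to an existing collection rather than resetting it.
        self.collections.setdefault(data_type, []).extend(records)
```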

The existing function definitions at the bottom of database.py may not need to be updated. These were designed to update multiple data types at once. If updating multiple data types per single event trigger is still the desired behavior, updating the methods described in the paragraph above may be sufficient.

You will need to update the allowed post data types at the top of the file post_status.py. Suppose a new data type called elves is being introduced; the string "scan_elves" should be added to the Python sets allowed_status_types and full_reset_types, see the image below.

[image: the allowed_status_types and full_reset_types sets in post_status.py]
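The whitelist change amounts to adding one string per set. The set contents below are illustrative (the source only confirms that "scan_elves" must be added to both sets; "scan_smurf" is an assumed existing entry):

```python
# Sketch of the status-type whitelists at the top of post_status.py.
# "scan_smurf" is assumed to be the pre-existing entry; "scan_elves" is the
# new string added for the hypothetical elves data type.
allowed_status_types = {"scan_smurf", "scan_elves"}
full_reset_types = {"scan_smurf", "scan_elves"}


def is_allowed(status_type):
    """Posting a status is rejected unless its type is whitelisted."""
    return status_type in allowed_status_types
```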

Important

The file database.py contains the code that uploads parsed data to MongoDB.

Important

Posting a status to the Django SQLite database is only allowed when that status type has been defined at the top of post_status.py. A future developer could choose to remove this restriction.

Metadata


Labels

database (Adding new data types/scheduling data tasks), enhancement (New feature or request)
