Home works 15 and 18 implementation #3

Open
Regadene wants to merge 6 commits into main from HW18
@Regadene
Owner

No description provided.

@nvaulin nvaulin left a comment

Hi:

HW 15: 23/25 (BiologicalSequence) + 23/25 (FastQ) = 46/50 * 0.5 (submitted after the deadline) = 23

HW 18: 10/10 + 8/8 + 32/32 = 50/50

Comment on lines +4 to +29
COMPLEMENT_DICT_DNA = {
    "A": "T",
    "a": "t",
    "G": "C",
    "g": "c",
    "T": "A",
    "t": "a",
    "C": "G",
    "c": "g",
}

# creating a dictionary for the complementary sequence
# from mRNA 5' - 3' to cDNA 3' - 5'
COMPLEMENT_DICT_RNA = {
    "A": "T",
    "a": "t",
    "G": "C",
    "g": "c",
    "U": "A",
    "u": "a",
    "C": "G",
    "c": "g",
}

# creating a dictionary for transcription DNA -> RNA (5' - 3' DNA ->
# 5' - 3' RNA)

By the way, there are already so many constants here that it makes sense to move them into a separate file.

Comment on lines +72 to +88
    def __init__(self, sequence: str):
        self.sequence = sequence

    def __len__(self):
        return len(self.sequence)

    def __getitem__(self, index):
        return self.__class__(self.sequence[index])

    def __str__(self):
        return self.sequence

    def __repr__(self):
        return f'{self.__class__.__name__}(sequence="{self.sequence}")'

    def is_sequence_correct(self):
        return set(self.sequence).issubset(self.alphabet)

Only the definitions belong here, without function bodies. Abstract classes exist to fix the presence of an interface, not to implement it.
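The point of this comment can be sketched as follows. This is a minimal illustration, not the expected solution: the method names are taken from the snippet under review, while `DNASequence` and its `alphabet` are hypothetical and exist only to show where the bodies would move.

```python
from abc import ABC, abstractmethod


class BiologicalSequence(ABC):
    """Interface only: every method is declared, none is implemented."""

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, index): ...

    @abstractmethod
    def __str__(self) -> str: ...

    @abstractmethod
    def is_sequence_correct(self) -> bool: ...


class DNASequence(BiologicalSequence):
    """Hypothetical concrete class that supplies the actual bodies."""

    alphabet = set("ACGTacgt")  # assumption: illustrative alphabet

    def __init__(self, sequence: str):
        self.sequence = sequence

    def __len__(self) -> int:
        return len(self.sequence)

    def __getitem__(self, index):
        return self.__class__(self.sequence[index])

    def __str__(self) -> str:
        return self.sequence

    def is_sequence_correct(self) -> bool:
        return set(self.sequence).issubset(self.alphabet)
```

With this layout, `BiologicalSequence()` raises `TypeError`, while `DNASequence("ACGT")` can be created and used normally.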

return set(self.sequence).issubset(self.alphabet)


class NucleicAcidSequence(BiologicalSequence, ABC):

And here there is no longer any need to inherit from ABC.
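A short sketch of why the extra base is redundant: a subclass of an ABC-derived class already inherits the `ABCMeta` metaclass, so its own abstract methods are still enforced. The `complement` method here is a hypothetical placeholder, not necessarily part of the reviewed code.

```python
from abc import ABC, ABCMeta, abstractmethod


class BiologicalSequence(ABC):
    @abstractmethod
    def is_sequence_correct(self) -> bool: ...


# ABC is not listed again: the metaclass comes from BiologicalSequence.
class NucleicAcidSequence(BiologicalSequence):
    @abstractmethod
    def complement(self): ...  # hypothetical abstract method


inherits_metaclass = type(NucleicAcidSequence) is ABCMeta  # True
```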

Comment on lines +10 to +11
gc_bounds: int | list[int] = [0, 100],
length_bounds: int | list[int] = [0, 2**32],

Mutable data types must not be used as default values! In the first semester we looked at the problems this can lead to.

    if not os.path.exists(input_fastq):
        raise FileNotFoundError("Input fastq file does not exists")

    if type(gc_bounds) is int and gc_bounds > 0:

Type checks are done differently :)
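Presumably the comment refers to `isinstance`, which is the usual way to check types in Python. A small sketch of the difference, with a hypothetical `Bounds` subclass used only for illustration:

```python
class Bounds(list):
    """Hypothetical list subclass, used only to show the difference."""


value = Bounds([20, 80])

exact_match = type(value) is list           # False: exact type comparison ignores subclasses
instance_match = isinstance(value, list)    # True: isinstance respects inheritance
several_types = isinstance(5, (int, float))  # True: a tuple of types is accepted
```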

    elif type(gc_bounds) is not tuple and len(gc_bounds) != 2:
        return "Wrong value for the gc_bounds argument"
    elif (
        type(gc_bounds) is not list

Suggested change
- type(gc_bounds) is not list
+ not isinstance(gc_bounds, list)

What if it is a tuple?
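One way to handle the tuple case is to accept both sequence types in a single check. The helper below is a hypothetical sketch, not the reviewed function: its name, the `default_lower` parameter, and the choice to raise `ValueError` instead of returning a message string are all assumptions.

```python
def normalize_bounds(bounds, default_lower=0):
    """Return (lower, upper) from a number or a two-element list/tuple."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(bounds, bool):
        raise ValueError("bounds must be a number or a pair of numbers")
    if isinstance(bounds, (int, float)):
        return (default_lower, bounds)
    if isinstance(bounds, (list, tuple)) and len(bounds) == 2:
        lower, upper = bounds
        if lower > upper:
            raise ValueError("lower bound must not exceed upper bound")
        return (lower, upper)
    raise ValueError("bounds must be a number or a pair of numbers")
```

With this sketch, `normalize_bounds(80)`, `normalize_bounds([20, 80])` and `normalize_bounds((20, 80))` all go through the same code path.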

"--output",
dest="output",
default=None,
type=str,

I recommend taking a look at: https://docs.python.org/3/library/argparse.html#filetype-objects
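A rough sketch of what the linked section describes: `argparse.FileType` lets the parser open the file itself, so the argument arrives as a file object rather than a string. The parser description and the temporary path are illustrative assumptions, and whether `FileType` is still the recommended approach may depend on the Python version, so the linked docs are worth checking.

```python
import argparse
import os
import tempfile


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical FASTQ filter CLI")
    parser.add_argument(
        "--output",
        dest="output",
        default=None,
        type=argparse.FileType("w"),  # argparse opens the file for writing
    )
    return parser


# Demo with a throwaway file so the sketch is self-contained.
out_path = os.path.join(tempfile.mkdtemp(), "filtered.fastq")
args = build_parser().parse_args(["--output", out_path])
args.output.write("@read1\nACGT\n+\nIIII\n")
args.output.close()
```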

Comment on lines +1 to +3
import argparse
from bioinf_utility import filter_fastq
import logging

Suggested change
- import argparse
- from bioinf_utility import filter_fastq
- import logging
+ import argparse
+ import logging
+ from bioinf_utility import filter_fastq

Comment on lines +26 to +28
    result = subprocess.run(
        ["python", "filter_fastq_CLI.py"] + args, capture_output=True, text=True
    )

Great that launching from the command line is tested too!

Comment on lines +38 to +40
    assert (
        len(records) == filtered_seqs_amount
    ), f"Expected {filtered_seqs_amount} sequences, got {len(records)}"

It would of course be good to also know exactly which sequences failed the filter. But that is really a question about the data. In the real world it would be worth investing in a set of correct answer files for each set of arguments and then comparing the files. But that is a separate story.
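The file-comparison idea could look roughly like this. It is a sketch under assumptions: `diff_against_expected` is a hypothetical helper, and the two FASTQ files are throwaway demo data rather than real expected outputs from the project.

```python
import difflib
import os
import tempfile


def diff_against_expected(actual_path: str, expected_path: str) -> list[str]:
    """Return unified-diff lines; an empty list means the files match."""
    with open(actual_path) as actual, open(expected_path) as expected:
        actual_lines = actual.readlines()
        expected_lines = expected.readlines()
    return list(
        difflib.unified_diff(
            expected_lines, actual_lines, fromfile="expected", tofile="actual"
        )
    )


# Throwaway files: the "actual" output is missing read2.
tmp_dir = tempfile.mkdtemp()
expected_path = os.path.join(tmp_dir, "expected.fastq")
actual_path = os.path.join(tmp_dir, "actual.fastq")
with open(expected_path, "w") as fh:
    fh.write("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n")
with open(actual_path, "w") as fh:
    fh.write("@read1\nACGT\n+\nIIII\n")

diff = diff_against_expected(actual_path, expected_path)
```

In a test, `assert not diff, "".join(diff)` would then show exactly which records differ instead of only the record count.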
