Sensitive fields not cleaned from response body #98

@Tsdevendra1

Description:

Summary: When an APIView returns sensitive fields in its response, those fields are not cleaned in the logged response body.

Steps to reproduce: Create TestView as below and note the output of the print statement.

from rest_framework.response import Response
from rest_framework.views import APIView
from rest_framework_tracking.mixins import LoggingMixin

# Minimal view that returns a sensitive field ("key") in the response body.
class TestView(LoggingMixin, APIView):
    def get(self, request):
        return Response({"key": "password"}, status=200)

    def handle_log(self):
        super().handle_log()
        print(self.log)

Output:
{'requested_at': datetime.datetime(2023, 6, 9, 17, 20, 11, 319782, tzinfo=<UTC>), 'data': {}, 'remote_addr': <removed by myself>, 'view': 'api.views.TestView', 'view_method': 'get', 'path': '/test/articles/', 'host': '0.0.0.0:8000', 'user_agent': 'HTTPie/3.2.1', 'method': 'GET', 'query_params': {}, 'user': None, 'username_persistent': 'Anonymous', 'response_ms': 0, 'response': '{"key":"password"}', 'status_code': 200}

Expected behavior: The value of the sensitive "key" field should be masked in the logged response, e.g.

'response': '{"key":"*******"}'

Actual behavior: The value is logged unmasked

'response': '{"key":"password"}'

Environment: Happens in all environments (even locally)

Additional context:

It seems the issue stems from how the log is updated and the data is cleaned. The "response" entry is built from rendered_content, which is of type <class 'bytes'>. _clean_data decodes the bytes into a string of the form '{"key":"password"}'. Because the result is a string rather than a dict or list, none of the cleaning logic is applied to it. The relevant code:


            self.log.update(
                {
                    "remote_addr": self._get_ip_address(request),
                    "view": self._get_view_name(request),
                    "view_method": self._get_view_method(request),
                    "path": self._get_path(request),
                    "host": request.get_host(),
                    "user_agent": request.META.get("HTTP_USER_AGENT", ""),
                    "method": request.method,
                    "query_params": self._clean_data(request.query_params.dict()),
                    "user": user,
                    "username_persistent": user.get_username() if user else "Anonymous",
                    "response_ms": self._get_response_ms(),
                    "response": self._clean_data(rendered_content),
                    "status_code": response.status_code,
                }
            )

    def _clean_data(self, data):
        """
        Clean a dictionary of data of potentially sensitive info before
        sending to the database.
        Function based on the "_clean_credentials" function of django
        (https://github.com/django/django/blob/stable/1.11.x/django/contrib/auth/__init__.py#L50)

        Fields defined by django are by default cleaned with this function

        You can define your own sensitive fields in your view by defining a set
        eg: sensitive_fields = {'field1', 'field2'}
        """
        if isinstance(data, bytes):
            data = data.decode(errors="replace")

        if isinstance(data, list):
            return [self._clean_data(d) for d in data]

        if isinstance(data, dict):
            SENSITIVE_FIELDS = {
                "api",
                "token",
                "key",
                "secret",
                "password",
                "signature",
            }

            data = dict(data)
            if self.sensitive_fields:
                SENSITIVE_FIELDS = SENSITIVE_FIELDS | {
                    field.lower() for field in self.sensitive_fields
                }

            for key, value in data.items():
                try:
                    value = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    pass
                if isinstance(value, (list, dict)):
                    data[key] = self._clean_data(value)
                if key.lower() in SENSITIVE_FIELDS:
                    data[key] = self.CLEANED_SUBSTITUTE
        return data
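
The behaviour can be reproduced without Django. The sketch below is a standalone simplification of the _clean_data logic quoted above. The function name clean_data, the module-level constants, and the value of CLEANED_SUBSTITUTE are my own stand-ins for illustration, not the library's code. It shows that a dict is cleaned, while the same payload passed as rendered bytes comes back unchanged.

```python
import ast

# Stand-ins for the mixin's attributes; the substitute value is assumed.
CLEANED_SUBSTITUTE = "********************"
SENSITIVE_FIELDS = {"api", "token", "key", "secret", "password", "signature"}


def clean_data(data):
    """Standalone simplification of LoggingMixin._clean_data (illustrative only)."""
    if isinstance(data, bytes):
        # Bytes are decoded into a str, not parsed into a dict.
        data = data.decode(errors="replace")

    if isinstance(data, list):
        return [clean_data(d) for d in data]

    if isinstance(data, dict):
        data = dict(data)
        for key, value in data.items():
            try:
                value = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                pass
            if isinstance(value, (list, dict)):
                data[key] = clean_data(value)
            if key.lower() in SENSITIVE_FIELDS:
                data[key] = CLEANED_SUBSTITUTE

    # A str matches neither branch above and is returned untouched.
    return data


as_dict = clean_data({"key": "password"})
as_bytes = clean_data(b'{"key":"password"}')
print(as_dict)   # the "key" value is replaced by CLEANED_SUBSTITUTE
print(as_bytes)  # the JSON string is returned as-is
```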

Please let me know if I'm not understanding something properly/using this wrong.
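
One possible direction, as a sketch only: parse the decoded response as JSON before cleaning, then serialize it again. The helper names mask and clean_rendered_content are hypothetical, as are the constants, and I have not tested this against the library itself. Non-JSON responses are returned unchanged.

```python
import json

# Stand-ins for the mixin's attributes; the substitute value is assumed.
CLEANED_SUBSTITUTE = "********************"
SENSITIVE_FIELDS = {"api", "token", "key", "secret", "password", "signature"}


def mask(data):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(data, list):
        return [mask(item) for item in data]
    if isinstance(data, dict):
        return {
            key: CLEANED_SUBSTITUTE
            if str(key).lower() in SENSITIVE_FIELDS
            else mask(value)
            for key, value in data.items()
        }
    return data


def clean_rendered_content(rendered_content):
    """Hypothetical helper: clean a rendered response body if it is JSON."""
    if isinstance(rendered_content, bytes):
        rendered_content = rendered_content.decode(errors="replace")
    try:
        parsed = json.loads(rendered_content)
    except (TypeError, ValueError):
        # Not JSON (e.g. HTML or plain text): leave the body as it is.
        return rendered_content
    # Compact separators keep the output close to DRF's rendered JSON.
    return json.dumps(mask(parsed), separators=(",", ":"))


cleaned = clean_rendered_content(b'{"key":"password"}')
print(cleaned)
```

This only covers JSON bodies; responses rendered in other formats would still be logged as-is.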
