Package inappropriately asks for elevated permissions #22

@BlueNalgene

Basics

OS: Kubuntu 22.04.3 LTS x86_64 
Host: TITAN Standard 
Kernel: 5.15.0-86-generic 
Uptime: 6 days, 2 hours, 12 mins 
Packages: 3693 (dpkg), 12 (flatpak), 17 (snap) 
Shell: bash 5.1.16 
Resolution: 3840x2160, 3840x2160, 2560x1440 
DE: Plasma 5.24.7 
WM: KWin 
Theme: [Plasma], Breeze [GTK2/3] 
Icons: [Plasma], breeze-dark [GTK2/3] 
Terminal: konsole 
Terminal Font: Hack 12 
CPU: AMD Ryzen 9 5900HX with Radeon Graphics (16) @ 3.300GHz 
GPU: AMD ATI 06:00.0 Cezanne 
GPU: NVIDIA GeForce RTX 3070 Mobile / Max-Q 
Memory: 12008MiB / 31487MiB 

Package installed from pip.

>>> print(mk.__version__)
'3.4.1'

Description

When writing a CSV, I get an unexpected permissions error. It looks like the package is attempting to download a file directly to a write-protected location on my drive where Python packages are stored.

Code to Reproduce

Using your "savedrecs.txt" file in the examples:

import metaknowledge as mk
# Load the example file
records = mk.RecordCollection("/home/wes/Downloads/savedrecs.txt")
RCfiltered = records.copy()
# Keep only the records whose title starts with 'A'
for each in records:
    if each.title[0] != 'A':
        RCfiltered.discard(each)
RCfiltered.writeFile("/home/wes/Downloads/Records_Starting_with_A.txt") # works
RCfiltered.writeCSV("/home/wes/Downloads/Records_Starting_with_A.csv") # errors

The error I see after the last line is:

>>> RCfiltered.writeCSV("/home/wes/Downloads/Records_Starting_with_A.csv")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 38, in downloadData
    with open(targetFilePath, 'wb') as f:
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/namesData.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/recordCollection.py", line 353, in writeCSV
    recDict['num-Male'], recDict['num-Female'], recDict['num-Unknown'] = R.authGenders(_countsTuple = True)
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/mkRecord.py", line 680, in authGenders
    authDict = recordGenders(self)
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 69, in recordGenders
    return {auth : nameStringGender(auth, noExcept = True) for auth in R.get('authorsFull', [])}
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 69, in <dictcomp>
    return {auth : nameStringGender(auth, noExcept = True) for auth in R.get('authorsFull', [])}
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 65, in nameStringGender
    mappingDict = getMapping()
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 45, in getMapping
    downloadData(useUK)
  File "/usr/local/lib/python3.10/dist-packages/metaknowledge/genders/nameGender.py", line 41, in downloadData
    raise PermissionError("Can not write to {}, you try rerunning with higher privileges".format(targetFilePath))
PermissionError: Can not write to /usr/local/lib/python3.10/dist-packages/metaknowledge/genders/namesData.csv, you try rerunning with higher privileges
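
For anyone who wants to confirm the same diagnosis on their own install, a check along these lines should show whether the package directory is writable by the current user. This is only a minimal sketch: `package_dir_is_writable` is a hypothetical helper written for this report, not part of metaknowledge, and it assumes the package is a regular (non-namespace) package installed on disk.

```python
import importlib.util
import os
from pathlib import Path


def package_dir_is_writable(package_name):
    """Return (directory, writable) for an installed top-level package.

    Hypothetical helper for diagnosis only. It locates the package through
    its import spec, so the package itself is not imported.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"No installed package named {package_name!r}")
    package_dir = Path(spec.origin).parent
    # os.access reports whether the current user may write to the directory
    return package_dir, os.access(package_dir, os.W_OK)


# Example (requires metaknowledge to be installed):
# print(package_dir_is_writable("metaknowledge"))
```

On a system-wide pip install like mine, I would expect the second value to be False for a normal user, which matches the Errno 13 above.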

The Problem

Investigating, it looks like the download into the package directory is hard-coded in the nameGender.py file, which is strange. A Python package should be reproducible on download and deterministic during install, with its contents explicitly declared. See this related Stack Overflow question. If these name data ratios are required by a package that is publicly distributed on PyPI, you should consider shipping the data as a separate sub-package and declaring it as a dependency with a PEP 508 dependency specification.
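
If downloading at runtime has to stay, one alternative would be to write the file to a per-user cache directory instead of the install location. The following is a minimal sketch of that idea, not a patch against the actual code: `user_data_file` is a hypothetical name, and the location follows the XDG convention (`$XDG_CACHE_HOME`, falling back to `~/.cache`), which is an assumption that fits Linux best.

```python
import os
from pathlib import Path


def user_data_file(app_name, file_name, env=None):
    """Return a path for file_name inside a per-user cache directory.

    Hypothetical helper. Uses $XDG_CACHE_HOME when it is set and non-empty,
    otherwise ~/.cache. The directory is created if it does not exist yet.
    """
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME") or os.path.join(os.path.expanduser("~"), ".cache")
    cache_dir = Path(base) / app_name
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir / file_name


# Example: a user-writable target instead of dist-packages
# target = user_data_file("metaknowledge", "namesData.csv")
```

A path like this is writable without elevated permissions, so the download step would never need to ask the user to rerun with higher privileges.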

Honestly, I got a bit nervous when I saw this package trying to download something to a write-protected location and then suggesting I rerun with elevated permissions. It feels like something that could be abused to convince users to download unsafe files. I doubt there is ill intent, but it is surprising given how heavily this package is used as an analysis tool. At the very least, some documentation of the logic and how to use it would be nice.
