Strange Python ctypes behavior. Always loads the m (math) library - python

Can someone explain to me why the following Python code works?
import ctypes
import ctypes.util
boblib = ctypes.cdll.LoadLibrary(ctypes.util.find_library("bob_is_your_uncle"))
boblib.cos.argtypes = [ctypes.c_double]
boblib.cos.restype = ctypes.c_double
print(boblib.cos(0)) # This prints out "1.0"
I am 1000% sure that there is no "bob_is_your_uncle" library on my filesystem. Yet, it seems like ctypes loads the m library. Why is this happening?
Also, if I do this: print(boblib), I get this:
<CDLL 'None', handle 7f6a80f6d170 at 0x7f6a7f34d0b8>
What does CDLL 'None' mean?
Thanks in advance.
PS: Doing a --version on both my Python interpreters I get:
Python 3.6.5rc1 and Python 2.7.14+. The above code gives the same result on both versions. My OS is Debian (Testing repo).

It's not loading the math library. It appears to be loading the Python executable itself, which has cos linked in.
There is indeed no library named bob_is_your_uncle, so find_library returns None. (That's where the None comes from in the output you're seeing.)
On Unix, the LoadLibrary logic has a specific check that translates a None name to a null pointer for the underlying dlopen routine. dlopen has special handling for a null name:
If filename is NULL, then the returned handle is for the main program.
In fact, on Unix, ctypes.pythonapi is created as
pythonapi = PyDLL(None)
explaining why the None handling is there in the first place. The CDLL object you've created is almost like ctypes.pythonapi, except that it doesn't hold the GIL for function calls (because CDLL instead of PyDLL), so it's useless for interacting with the actual C Python API.
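The behaviour described above can be checked directly. What follows is a minimal sketch for Unix-like systems only (it relies on dlopen semantics, so it is not expected to work on Windows). The helper name load_main_program is made up for illustration, and getpid is used instead of cos simply because libc symbols are certain to be reachable through the main program's handle.

```python
import ctypes
import ctypes.util
import os


def load_main_program():
    """Return (name, handle) for the running process itself (Unix only).

    find_library() returns None for a library that does not exist, and
    CDLL(None) hands a NULL filename to dlopen(), which yields a handle
    for the main program rather than raising an error.
    """
    name = ctypes.util.find_library("bob_is_your_uncle")
    return name, ctypes.CDLL(name)


if __name__ == "__main__":
    name, lib = load_main_program()
    print(name)                          # None, since no such library exists
    print(lib)                           # a CDLL whose name is None
    print(lib.getpid() == os.getpid())   # symbols of the process are callable
```

Checking the result of find_library for None before passing it to LoadLibrary is the simplest way to avoid this surprise in real code.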

Related

_shutdown AttributeError (ignored) when linting code that uses M2Crypto

I'm running lint as follows:
$ python -m pylint.lint m2test.py
with this code:
import M2Crypto
def f():
    M2Crypto.RSA.new_pub_key("").as_pem(cipher=None).split("\n")
The lint output ends with:
Exception AttributeError: '_shutdown' in <module 'threading' from '/usr/lib/python2.7/site-packages/M2Crypto-0.21.1-py2.7-linux-x86_64.egg/M2Crypto/threading.pyc'> ignored
This code works fine when run (the above is actually a minimal test case; but the full version does work). The exception is ignored, but Bitten considers this a failure, so stops on this step.
I've tried adding 'M2Crypto.threading.init()'/'M2Crypto.threading.cleanup()' around the definition of the function, but that didn't fix the problem.
How can I prevent this problem from occurring?
I'm using M2Crypto 0.21.1, pylint 0.24 and Python 2.7 (also tried 2.7.2) on Debian Lenny x86_64.
The exception that you are seeing is caused by a bug in the astng package (presumably “Abstract Syntax Tree, Next Generation”?) which is a toolkit on which pylint depends, written by the same people. I should note in passing that I always encourage people to use pyflakes instead of pylint when possible, because it is quick, simple, fast, and predictable, whereas pylint tries to do several kinds of deep magic that are not only slow but that can get it into exactly this kind of trouble. :)
Here are the two packages on PyPI:
http://pypi.python.org/pypi/pylint
http://pypi.python.org/pypi/astng
And note that this problem had to be, necessarily, a bug in pylint and not in your code, because pylint does not run your code in order to produce its report — imagine the havoc that could be wreaked if it did (since code being linted might delete files, etcetera)! Since your code does not get run, no amount of caution, like protecting your call with threading init() or cleanup() functions, could possibly have prevented this error — unless the code snippets happened, for other reasons, to alter the behavior we are about to investigate.
So, on to your actual exception.
I had never actually heard of _shutdown before! A quick search of the Python standard library showed its definition in threading.py but not a call of the function from anywhere; only by searching the Python C source code did I discover where in pythonrun.c, during interpreter shutdown, the function is actually called:
static void
wait_for_thread_shutdown(void)
{
    ...
    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                                  "threading");
    if (threading == NULL) {
        /* threading not imported */
        PyErr_Clear();
        return;
    }
    result = PyObject_CallMethod(threading, "_shutdown", "");
    if (result == NULL) {
        PyErr_WriteUnraisable(threading);
    }
    ...
}
Apparently it is some sort of cleanup function that the threading Standard Library module requires, and they have special-cased the Python interpreter itself to make sure that it gets called.
As you can see from the code above, Python quietly and without complaint handles the case where the threading module never gets imported during a program's run. But if threading does get imported, and still exists at shutdown time, then the interpreter looks inside for a _shutdown function and goes so far as to print an error message — and then return a non-zero exit status, the cause of your problems — if it cannot call it.
So we have to discover why the threading module exists but has no _shutdown method at the moment when pylint is done examining your program and Python is exiting. Some instrumentation is called for. Can we print out what the module looks like as pylint exits? We can! The pylint/lint.py module, in its last few lines, runs its “main program” by instantiating a Run class it has defined:
if __name__ == '__main__':
    Run(sys.argv[1:])
So I opened lint.py in my editor (one of the magnificent things about having each little project installed in a Python Virtual Environment is that I can jump in and edit third-party code for quick experiments) and added the following print statement down at the bottom of the Run class's __init__() method:
        sys.path.pop(0)
        print "*****", sys.modules['threading'].__file__ # added by me!
        if exit:
            sys.exit(self.linter.msg_status)
I re-ran the command:
python -m pylint.lint m2test.py
And out came the __file__ string of the threading module:
***** /home/brandon/venv/lib/python2.7/site-packages/M2Crypto/threading.pyc
Well, look at that.
This is the problem!
According to this path, there actually exists an M2Crypto/threading.py module that, under all normal circumstances, should just be called M2Crypto.threading, and therefore sit in the sys.modules dictionary under the name:
sys.modules['M2Crypto.threading']
But somehow that file is also getting loaded as the main Python threading module, shadowing the official threading module that sits in the Standard Library. Because of this, the Python exit logic is quite correctly complaining that the Standard Library _shutdown() function is missing.
How could this happen? Top-level modules can only appear in paths that are listed explicitly in sys.path, not in sub-directories beneath them. This leads to a new question: is there any point during the pylint run that the …/M2Crypto/ directory itself is getting put on sys.path as though it contained top-level modules? Let's see!
We need more instrumentation: we need to have Python tell us the moment that a directory with M2Crypto in the name appears in sys.path. It will really slow things down, but let's add a trace function to pylint's __init__.py — because that is the first module that gets imported when you run -m pylint.lint — that will write an output file telling us, for every line of code executed, whether sys.path has any bad values in it:
def install_tracer():
    import sys
    output = open('mytracer.out', 'w')
    def mytracer(frame, event, arg):
        broken = any(p.endswith('M2Crypto') for p in sys.path)
        output.write('{} {}:{} {}\n'.format(
            broken, frame.f_code.co_filename, frame.f_lineno, event))
        return mytracer
    sys.settrace(mytracer)

install_tracer()
del install_tracer
Note how careful I am here: I define only one name in the module's namespace, and then carefully delete it to clean up after myself before I let pylint continue loading! And all of the resources that the trace function itself needs — namely, the sys module and the output open file — are available in the install_tracer() closure so that, from the outside, pylint looks exactly the same as always. Just in case anyone tries to introspect it, like pylint might!
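For readers on Python 3, the same tracing idea can be sketched as a small helper that collects the records in memory instead of writing a file. This is only an illustration under assumptions: trace_path_pollution and demo_pollution are hypothetical names, and the fake path entry exists purely to trigger the check.

```python
import sys


def trace_path_pollution(func, marker):
    """Run func() under a trace function and record, for every traced event,
    whether any sys.path entry currently ends with `marker`."""
    records = []

    def tracer(frame, event, arg):
        broken = any(p.endswith(marker) for p in sys.path)
        records.append(
            (broken, frame.f_code.co_filename, frame.f_lineno, event))
        return tracer  # keep tracing line events inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return records


def demo_pollution():
    """Temporarily put a made-up package directory on sys.path."""
    sys.path.insert(0, "/nonexistent/site-packages/M2Crypto")
    sys.path.pop(0)


if __name__ == "__main__":
    for record in trace_path_pollution(demo_pollution, "M2Crypto"):
        print(*record)
```

Tracing every line is slow, as noted above, so a helper like this is best wrapped around the narrowest call that still reproduces the problem.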
This generates a file mytracer.out of about 800k lines, that each look something like this:
False /home/brandon/venv/lib/python2.7/posixpath.py:118 call
The False says that sys.path looks clean, the filename and line number are the line of code being executed, and call indicates what stage of execution the interpreter is in.
So does sys.path ever get poisoned? Let's look at just the first True or False on each line, and see how many successive lines start with each value:
$ awk '{print$1}' mytracer.out | uniq -c
607997 False
3173 True
4558 False
33217 True
4304 False
41699 True
2953 False
110503 True
52575 False
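If awk and uniq are not at hand, the same run-length summary can be produced in Python. This is a small sketch, with run_lengths being a name chosen here for illustration; it groups consecutive lines by their first whitespace-separated field, as the shell pipeline does.

```python
from itertools import groupby


def run_lengths(lines):
    """Return (count, value) pairs for runs of consecutive lines that share
    the same first whitespace-separated field, like `uniq -c`."""
    first_fields = (line.split(None, 1)[0] for line in lines if line.strip())
    return [(len(list(group)), key) for key, group in groupby(first_fields)]


if __name__ == "__main__":
    with open("mytracer.out") as trace_file:
        for count, value in run_lengths(trace_file):
            print(count, value)
```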
Wow! That's a problem! For runs of several thousand lines at a time, our test case is True, which means that the interpreter is running with …/M2Crypto/ — or some variant of a pathname with M2Crypto in it — on the path, where it should not be; only the directory that contains …/M2Crypto should ever be on the path. Looking for the first False to True transition in the file, I see this:
False /home/brandon/venv/lib/python2.7/site-packages/logilab/astng/builder.py:132 line
False /home/brandon/venv/lib/python2.7/posixpath.py:118 call
...
False /home/brandon/venv/lib/python2.7/posixpath.py:124 line
False /home/brandon/venv/lib/python2.7/posixpath.py:124 return
True /home/brandon/venv/lib/python2.7/site-packages/logilab/astng/builder.py:133 line
And looking at lines 132 and 133 in the builder.py file reveals our culprit:
130         # build astng representation
131         try:
132             sys.path.insert(0, dirname(path)) # XXX (syt) iirk
133             node = self.string_build(data, modname, path)
134         finally:
135             sys.path.pop(0)
Note the comment, which is part of the original code, not an addition of my own! Obviously, XXX (syt) iirk is an exclamation in this programmer's strange native language for the phrase, “put this module's parent directory on sys.path so that pylint will break mysteriously every time someone forces pylint to introspect a package with a threading sub-module.” It is, obviously, a very compact native language. :)
If you adjust the tracing module to watch sys.modules for the actual import of threading — an exercise I will leave to the reader — you will see that it happens when SocketServer, which is imported by some other Standard Library module during the analysis, in turn tries to innocently import threading.
So let us review what is happening:
pylint is dangerous magic.
As part of its magic, if it sees you import foo, then it runs off trying to find foo.py on disk, to parse it, and to predict whether you are loading valid or invalid names from its namespace.
[See my comment, below.] Because you call .split() on the return value of RSA.as_pem(), pylint tries to introspect the as_pem() method, which in turn uses the M2Crypto.BIO module, which in turn makes calls that induce pylint to import threading.
As part of loading any module foo.py, pylint throws the directory containing foo.py on sys.path, even if that directory is inside a package, and therefore gives modules in that directory the privilege of shadowing Standard Library modules of the same name during its analysis.
When Python exits, it is upset that the M2Crypto.threading library is sitting where threading belongs, because it wants to run the _shutdown() method of threading.
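The shadowing step in the review above can be reproduced in isolation. The sketch below is written for Python 3 and deliberately shadows colorsys instead of threading, so that the running interpreter is not damaged. The package name fakepkg and the function name are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def shadowed_module_file(stdlib_name="colorsys"):
    """Put a package directory itself on sys.path and show that a submodule
    inside it then wins over the Standard Library module of the same name.

    Returns (module __file__, whether the impostor was the one imported).
    """
    saved = sys.modules.pop(stdlib_name, None)
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "fakepkg")  # hypothetical package
        os.mkdir(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, stdlib_name + ".py"), "w") as f:
            f.write("SHADOW = True\n")
        sys.path.insert(0, pkg_dir)  # the mistake: package dir on sys.path
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(stdlib_name)
            return module.__file__, hasattr(module, "SHADOW")
        finally:
            # Undo everything so the real module can be imported again.
            sys.path.pop(0)
            sys.modules.pop(stdlib_name, None)
            if saved is not None:
                sys.modules[stdlib_name] = saved


if __name__ == "__main__":
    print(shadowed_module_file())
```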
You should report this as a bug to the pylint / astng folks at logilab.org. Tell them I sent you.
If you decide to keep using pylint after it has done this to you, then there seem to be two solutions in this case: either don't inspect code that calls M2Crypto, or import threading during the pylint import process — by sticking import threading into the pylint/__init__.py, for example — so that the module gets the chance to grab the sys.modules['threading'] slot before pylint gets all excited and tries to let M2Crypto/threading.py grab the slot instead.
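The second workaround relies on the fact that a module already present in sys.modules is returned from the cache, whatever sys.path looks like later. A minimal Python 3 sketch of that effect, assuming nothing about pylint itself (the function name is made up for illustration):

```python
import importlib
import os
import sys
import tempfile
import threading  # imported early, so it owns sys.modules['threading']


def stdlib_threading_survives():
    """Place an impostor threading.py first on sys.path and check that a
    later import still returns the module that was imported earlier."""
    with tempfile.TemporaryDirectory() as shadow_dir:
        with open(os.path.join(shadow_dir, "threading.py"), "w") as f:
            f.write("# impostor module without a _shutdown function\n")
        sys.path.insert(0, shadow_dir)
        try:
            importlib.invalidate_caches()
            found = importlib.import_module("threading")
        finally:
            sys.path.pop(0)
    return found is threading


if __name__ == "__main__":
    print(stdlib_threading_survives())
```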
In conclusion, I think the author of astng says it best: XXX (syt) iirk. Indeed.
Many thanks to Brandon Craig Rhodes for having traced this down and for such a detailed post.
I've removed the offending line from astng; the code is available from the hg repository until logilab-astng 0.23.0 is out. And I can confirm this fixes the OP's problem.
This looks more like a hack but I think it works. Copying the result of "as_pem()" and splitting it.
import M2Crypto
def f():
    M2Crypto.RSA.new_pub_key("").as_pem(cipher=None)[:].split("\n")
I'm using Python 2.6.7, M2Crypto 0.21.1, pylint 0.23
I was unable to reproduce (pylint 0.24 and M2Crypto 0.21.1 on Ubuntu 11.04 64bit) but two suggestions:
Explicitly initialize threading:
import M2Crypto
def f():
    M2Crypto.threading.init()
    M2Crypto.RSA.new_pub_key("").as_pem(cipher=None).split("\n")
    M2Crypto.threading.cleanup()
Or recompile without threading:
m2crypto = Extension(name = 'M2Crypto.__m2crypto',
                     sources = ['SWIG/_m2crypto.i'],
                     extra_compile_args = ['-DTHREADING'],
                     #extra_link_args = ['-Wl,-search_paths_first'], # Uncomment to build Universal Mac binaries
                     )

Import in python 3 complains about argument as a str / bytes

I've been updating a quaternions package for integration with numpy, so that it can be used in both python 2 and python 3. Unfortunately, the basic import step fails miserably with 3.x, though it has never failed with python 2.7. (I use python2.7 to compile the 2.7 version, and python3.x to compile the 3.x versions. It's a really simple distutils thing.) The error message doesn't even appear in google's results, and I just have no idea where to go from here.
Here is the complete output from a simple attempt to import the package:
> python -c 'import quaternion'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/mynamehere/.continuum/anaconda/envs/py3k/lib/python3.4/site-packages/quaternion/__init__.py", line 3, in <module>
    from .numpy_quaternion import quaternion
TypeError: __import__() argument 1 must be str, not bytes
As the error message says, there is a line in __init__.py saying
from .numpy_quaternion import quaternion
But why should that be problematic? There is a file numpy_quaternion.so in the same directory as the __init__.py file, which seems to contain the relevant symbols. Travis-CI shows that it works just fine in 2.7 (and the other tests pass), but fails in 3.2 and 3.4. So it's not just something wrong with my python installation. I tried to remove the . for the relative import, but python couldn't find the numpy_quaternion from which to import (not surprising). I tried changing it to from quaternion.numpy_quaternion, but I get the same error.
I see that there have been changes to the import system in python 3, but if anything, I would have guessed that this would be more py3k-compliant than other ways of doing it. What's going wrong? How can I get this to work?
Just to clarify, my hierarchy looks like this:
.../site-packages/
    quaternion/
        __init__.py
        numpy_quaternion.so
and the only thing that comes before the problematic line is import numpy as np, which generally succeeds with no problem.
The python-list people got back to me right away with excellent suggestions. Turns out I was importing something within numpy_quaternion.so (using the c-api), but the argument I was giving to that function was wrong. I was (basically) using code from a similar package:
PyObject* numpy_str = PyString_FromString("numpy");
PyObject* numpy = PyImport_Import(numpy_str);
I fixed it by using
PyObject* numpy = PyImport_ImportModule("numpy");
And as J. F. Sebastian points out in the comments, this went wrong because, when I was using python 3, PyString_FromString was just a #define for the wrong function.
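The same error can be reproduced from pure Python 3, which makes the cause easier to see: the import machinery insists on a str module name and rejects a bytes object. This is only an illustration of the type check, not of the C extension itself, and try_import is a name invented for the example.

```python
def try_import(name):
    """Call __import__ and return the module name, or a description of the
    TypeError raised when the name has the wrong type."""
    try:
        return __import__(name).__name__
    except TypeError as exc:
        return "TypeError: {}".format(exc)


if __name__ == "__main__":
    print(try_import("sys"))    # a str name imports normally
    print(try_import(b"sys"))   # a bytes name is rejected under Python 3
```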
Since it is easy, I would first try an absolute import, though if my guesses below are correct, this will not work.
from quaternion.numpy_quaternion import quaternion
From your post, I am guessing that your hierarchy looks like
.../Lib/site-packages
    quaternion
        __init__.py
        numpy_quaternion.so
            quaternion # a symbol in .so, not a .py
and that quaternion is a module, rather than a function or class. I am guessing this because I cannot imagine 'numpy_quaternion' becoming bytes, while the .so must return 'quaternion' as bytes for 2.7 to work, so maybe it is doing the same with 3.x. My unix experience predates Python. But my impression is that separate .so files are needed for 2.x and 3.x. Or if not, certain compile flags might be needed. If I am correct, you need to add 'numpy_quaternion_3x.so' to your package and switch the import on sys.version[0].
If you do not get more response here, try python-list, easily accessed at news.gmane.com as newsgroup mirror gmane.comp.python.general. The regular responders include some savvy linux users.

Python: where is the code for os.mkdir?

I've been looking through the code of the os module (just to be clear, I'm looking at the file /usr/lib/python2.7/os.py), and I've been trying to find the code for the mkdir function. From what I could tell, it comes from the 'posix' module, and it's a built-in function, same as range or max:
>>> import posix
>>> posix.mkdir
<built-in function mkdir>
>>> max
<built-in function max>
I'm guessing the code for these is written in C somewhere, and the python interpreter knows where to find them. Could someone explain, or point me to some resources that do, how and where these built-in functions are written and how they are integrated with the interpreter?
Thanks!
On POSIX platforms (and on Windows and OS/2) the os module imports from a C module, defined in posixmodule.c.
This module defines a posix_mkdir() function that wraps the mkdir() C call on POSIX platforms, CreateDirectoryW on Windows.
The module registers this function, together with others, in the module PyMethodDef posix_methods structure. When the module is imported, Python calls the PyMODINIT_FUNC() function, which uses that structure to create an appropriate module object with the posix_methods structure and adds a series of constants (such as the open() flag constants) to the module.
See the Extending Python with C or C++ tutorial on how C extensions work.
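From the Python side, one can at least confirm that a function is implemented in C and see which compiled module provides it. The following Python 3 sketch uses only the inspect module; describe is a helper name chosen for this example.

```python
import inspect
import os
import sys


def describe(func):
    """Report whether func is a C built-in, which module it belongs to,
    and whether Python source can be retrieved for it."""
    try:
        inspect.getsource(func)
        has_source = True
    except (TypeError, OSError):
        has_source = False
    return {
        "builtin": inspect.isbuiltin(func),
        "module": func.__module__,
        "python_source": has_source,
    }


if __name__ == "__main__":
    info = describe(os.mkdir)
    print(info)
    # Modules compiled into the interpreter are listed here.
    print(info["module"] in sys.builtin_module_names)
```

On a POSIX system the reported module is expected to be posix, and on Windows nt, which matches the C source file named in the answer above.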

How to load compiled python modules from memory?

I need to read all modules (pre-compiled) from a zipfile (built by py2exe compressed) into memory and then load them all.
I know this can be done by loading direct from the zipfile but I need to load them from memory.
Any ideas? (I'm using python 2.5.2 on windows)
TIA Steve
It depends on what exactly you have as "the module (pre-compiled)". Let's assume it's exactly the contents of a .pyc file, e.g., ciao.pyc as built by:
$ cat>'ciao.py'
def ciao(): return 'Ciao!'
$ python -c'import ciao; print ciao.ciao()'
Ciao!
IOW, having thus built ciao.pyc, say that you now do:
$ python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> b = open('ciao.pyc', 'rb').read()
>>> len(b)
200
and your goal is to go from that byte string b to an importable module ciao. Here's how:
>>> import marshal
>>> c = marshal.loads(b[8:])
>>> c
<code object <module> at 0x65188, file "ciao.py", line 1>
this is how you get the code object from the .pyc binary contents. Edit: if you're curious, the first 8 bytes are a "magic number" and a timestamp -- not needed here (unless you want to sanity-check them and raise exceptions if warranted, but that seems outside the scope of the question; marshal.loads will raise anyway if it detects a corrupt string).
Then:
>>> import types
>>> m = types.ModuleType('ciao')
>>> import sys
>>> sys.modules['ciao'] = m
>>> exec c in m.__dict__
i.e: make a new module object, install it in sys.modules, populate it by executing the code object in its __dict__. Edit: the order in which you do the sys.modules insertion and exec matters if and only if you may have circular imports -- but, this is the order Python's own import normally uses, so it's better to mimic it (which has no specific downsides).
You can "make a new module object" in several ways (e.g., from functions in standard library modules such as new and imp), but "call the type to get an instance" is the normal Python way these days, and the normal place to obtain the type from (unless it has a built-in name or you otherwise have it already handy) is from the standard library module types, so that's what I recommend.
Now, finally:
>>> import ciao
>>> ciao.ciao()
'Ciao!'
>>>
...you can import the module and use its functions, classes, and so on. Other import (and from) statements will then find the module as sys.modules['ciao'], so you won't need to repeat this sequence of operations (indeed you don't need this last import statement here if all you want is to ensure the module is available for import from elsewhere -- I'm adding it only to show it works;-).
Edit: If you absolutely must import in this way packages and modules therefrom, rather than "plain modules" as I just showed, that's doable, too, but a bit more complicated. As this answer is already pretty long, and I hope you can simplify your life by sticking to plain modules for this purpose, I'm going to shirk that part of the answer;-).
Also note that this may or may not do what you want in cases of "loading the same module from memory multiple times" (this rebuilds the module each time; you might want to check sys.modules and just skip everything if the module's already there) and in particular when such repeated "load from memory" occurs from multiple threads (needing locks -- but, a better architecture is to have a single dedicated thread devoted to performing the task, with other modules communicating with it via a Queue).
Finally, there's no discussion of how to install this functionality as a transparent "import hook" which automagically gets involved in the mechanisms of the import statement internals themselves -- that's feasible, too, but not exactly what you're asking about, so here, too, I hope you can simplify your life by doing things the simple way instead, as this answer outlines.
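For readers on Python 3, the same recipe needs two adjustments: exec is a function, and the .pyc header is longer. The sketch below assumes CPython 3.7 or later, where the header is 16 bytes, and builds its own demo .pyc in a temporary directory so that it is self-contained. The function names and the module name ciao_mem are made up for the example.

```python
import importlib.util
import marshal
import os
import py_compile
import sys
import tempfile
import types

PYC_HEADER_SIZE = 16  # assumption: CPython 3.7+ header layout


def load_module_from_pyc_bytes(name, data):
    """Create module `name` from the raw bytes of a .pyc file."""
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("bad magic number for module %r" % name)
    code = marshal.loads(data[PYC_HEADER_SIZE:])
    module = types.ModuleType(name)
    sys.modules[name] = module  # register first, as the import system does
    try:
        exec(code, module.__dict__)
    except BaseException:
        del sys.modules[name]
        raise
    return module


def build_demo_pyc_bytes():
    """Compile a tiny module to .pyc and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "ciao_mem.py")
        compiled = os.path.join(tmp, "ciao_mem.pyc")
        with open(source, "w") as f:
            f.write("def ciao():\n    return 'Ciao!'\n")
        py_compile.compile(source, cfile=compiled, doraise=True)
        with open(compiled, "rb") as f:
            return f.read()


if __name__ == "__main__":
    module = load_module_from_pyc_bytes("ciao_mem", build_demo_pyc_bytes())
    print(module.ciao())
```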
A compiled Python file consists of:
a magic number (4 bytes) identifying the type and version of Python,
a timestamp (4 bytes) used to check whether the source is newer,
the marshaled code object.
To load a module you have to create a module object with imp.new_module(), execute the unmarshaled code in the new module's namespace, and put it in sys.modules. Below is a sample implementation:
import sys, imp, marshal

def load_compiled_from_memory(name, filename, data, ispackage=False):
    if data[:4]!=imp.get_magic():
        raise ImportError('Bad magic number in %s' % filename)
    # Ignore timestamp in data[4:8]
    code = marshal.loads(data[8:])
    imp.acquire_lock() # Required in threaded applications
    try:
        mod = imp.new_module(name)
        sys.modules[name] = mod # To handle circular and submodule imports
                                # it should come before exec.
        try:
            mod.__file__ = filename # Is not so important.
            # For package you have to set mod.__path__ here.
            # Here I handle simple cases only.
            if ispackage:
                mod.__path__ = [name.replace('.', '/')]
            exec code in mod.__dict__
        except:
            del sys.modules[name]
            raise
    finally:
        imp.release_lock()
    return mod
Update: the code is updated to handle packages properly.
Note that you have to install an import hook to handle imports inside loaded modules. One way to do this is to add your finder to sys.meta_path. See PEP 302 for more information.
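As a rough idea of what such a hook looks like on Python 3, here is a sketch of a finder and loader that serves plain modules from an in-memory dictionary of code objects. It uses the find_spec protocol that superseded the original PEP 302 methods. The class name, the helper function and the module name mem_demo_mod are hypothetical, and packages are not handled.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve plain (non-package) modules from a dict of code objects."""

    def __init__(self, code_by_name):
        self._code = dict(code_by_name)

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self._code:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the other finders handle everything else

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self._code[module.__name__], module.__dict__)


def import_from_memory(name, source):
    """Compile `source`, install a finder for it, and import it by name."""
    finder = InMemoryFinder({name: compile(source, "<memory>", "exec")})
    sys.meta_path.insert(0, finder)
    try:
        return importlib.import_module(name)
    finally:
        sys.meta_path.remove(finder)


if __name__ == "__main__":
    module = import_from_memory("mem_demo_mod", "VALUE = 42\n")
    print(module.VALUE)
```

In a real application the finder would stay on sys.meta_path, so that import statements inside the loaded modules can find their in-memory siblings.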

fcntl substitute on Windows

I received a Python project (which happens to be a Django project, if that matters,) that uses the fcntl module from the standard library, which seems to be available only on Linux. When I try to run it on my Windows machine, it stops with an ImportError, because this module does not exist here.
Is there any way for me to make a small change in the program to make it work on Windows?
On Windows, the substitute for fcntl is a set of win32api calls. The usage is completely different; it is not some switch you can just flip.
In other words, porting a fcntl-heavy-user module to windows is not trivial. It requires you to analyze what exactly each fcntl call does and then find the equivalent win32api code, if any.
There's also the possibility that some code using fcntl has no windows equivalent, which would require you to change the module api and maybe the structure/paradigm of the program using the module you're porting.
If you provide more details about the fcntl calls people can find windows equivalents.
The fcntl module is just used for locking the pinning file, so assuming you don't try multiple access, this can be an acceptable workaround. Place this module in your PYTHONPATH, and it should just work as the official fcntl module.
Try using this module for development/testing purposes only in windows.
def fcntl(fd, op, arg=0):
    return 0

def ioctl(fd, op, arg=0, mutable_flag=True):
    if mutable_flag:
        return 0
    else:
        return ""

def flock(fd, op):
    return

def lockf(fd, operation, length=0, start=0, whence=0):
    return
Although this does not help you right away, there is an alternative that can work with both Unix (fcntl) and Windows (win32 api calls), called: portalocker
It describes itself as a cross-platform (posix/nt) API for flock-style file locking for Python. It basically maps fcntl to win32 api calls.
The original code at http://code.activestate.com/recipes/65203/ can now be installed as a separate package - https://pypi.python.org/pypi/portalocker
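If the project only needs whole-file locking and adding a dependency is not an option, a small wrapper that picks fcntl or msvcrt at import time may be enough. This is a sketch under that assumption, not a replacement for portalocker: the name locked is invented here, and on Windows it locks only the first byte of the file as a stand-in for the whole file.

```python
import contextlib
import tempfile

try:
    import fcntl

    def _lock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)

    def _unlock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)

except ImportError:  # no fcntl module, e.g. on Windows
    import msvcrt

    def _lock(f):
        position = f.tell()
        f.seek(0)  # msvcrt locks from the current file position
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)
        f.seek(position)

    def _unlock(f):
        position = f.tell()
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
        f.seek(position)


@contextlib.contextmanager
def locked(f):
    """Hold an exclusive lock on the open file f for the with-block."""
    _lock(f)
    try:
        yield f
    finally:
        _unlock(f)


def demo():
    """Write to a temporary file while holding the lock."""
    with tempfile.TemporaryFile() as f:
        with locked(f):
            f.write(b"data")
            f.flush()
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    print(demo())
```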
