python-magic
A Python wrapper for libmagic
I have a simple problem: in a system I'm developing, users can send us zip files and I need to filter their contents (to block applications and malicious scripts).
Blocking the inner files by extension is easy, but files without an extension are very common, and the extension isn't a reliable indicator of a file's content.
I've already tried to use python-magic, but it requires some packages that my server doesn't support, and the people running the server aren't going to help me.
I also don't have the option of moving the system to another server, so python-magic is not available to me in this case.
Does anyone have an idea of how to check the file type by its header?
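One way to approach this without libmagic is to compare the first bytes of each zip member against a small table of known signatures. The following is only a sketch using the standard library: the names SIGNATURES, sniff and sniff_zip_members are made up for illustration, and the signature table is deliberately short and far from complete.

```python
import zipfile

# A few well-known leading byte signatures; an illustrative, incomplete table.
SIGNATURES = [
    (b"MZ", "Windows executable (PE/DOS)"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive (also jar, docx, ...)"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\x1f\x8b", "gzip data"),
    (b"#!", "script with a shebang line"),
]


def sniff(head):
    """Return a label for the first matching signature, or None if unknown."""
    for prefix, label in SIGNATURES:
        if head.startswith(prefix):
            return label
    return None


def sniff_zip_members(zip_source, head_size=16):
    """Map each member name in a zip to the label guessed from its first bytes."""
    results = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Only the first few bytes of each member are decompressed and read.
            with archive.open(info) as member:
                results[info.filename] = sniff(member.read(head_size))
    return results
```

A result of None only means "not in the table", not "safe", so a blocklist built this way would probably need an allowlist or further checks on top of it.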
Source: (StackOverflow)
I am using s3cmd to upload files.
It always uploads .png files with the MIME type "image/x-png".
So I decided to install python-magic.
What I did here:
Installed Python 2.7 x86 on 64-bit Windows 7 (the python-magic manual said only x86 would work), downloaded from http://www.python.org/download/releases/2.7/
Installed the Python extension setuptools from http://www.lfd.uci.edu/~gohlke/pythonlibs/#setuptools
Downloaded and installed https://github.com/ahupp/python-magic using 'C:\Python27\python setup.py install'
Found the 3 files required by python-magic (magic1.dll, zlib1.dll, regex2.dll) and copied them to Windows/System32
At last s3cmd is using python-magic for MIME detection, but it fails with this error:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please report the following lines to:
s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Problem: MagicException: could not find any magic files!
S3cmd: 1.1.0-beta3
Traceback (most recent call last):
File "T:\My\Downloads\s3cmd\s3cmd", line 1788, in <module>
from S3.S3 import S3
File "T:\My\Downloads\s3cmd\S3\S3.py", line 35, in <module>
magic_ = magic.Magic(mime=True)
File "build\bdist.win32\egg\magic.py", line 51, in __init__
magic_load(self.cookie, magic_file)
File "build\bdist.win32\egg\magic.py", line 138, in errorcheck
raise MagicException(err)
Please advise how or where I can get the magic files.
Source: (StackOverflow)
I've downloaded and installed python-magic using "pip install python-magic".
Source: https://github.com/ahupp/python-magic
It downloaded and installed perfectly fine. I've also copied the 3 files (cygmagic-1.dll, cygwin1.dll, and cygz.dll) from my Cygwin installation into C:\Windows\System32.
I also downloaded magic1.dll and placed it in the System32 folder.
But the command prompt is still giving me this error:
ImportError: failed to find libmagic. Check your installation
Why is this so?
EDIT: I've also added C:\cygwin\bin to PATH.
Source: (StackOverflow)
I followed the instructions here...
https://github.com/ahupp/python-magic#dependencies
I ran pip install python-magic and it installed without any issues. Then I installed cygwin and added C:\cygwin\bin to my system path. When I run the python interpreter in a Windows command prompt and import magic I get this error...
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import magic
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\magic.py", line 161, in <module>
raise ImportError('failed to find libmagic. Check your installation')
ImportError: failed to find libmagic. Check your installation
>>>
Did I miss a step?
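A quick way to narrow this down is to ask ctypes directly which library names it can resolve, since the loader snippet quoted further down this page relies on ctypes.util.find_library. This is a diagnostic sketch only: locate_libmagic and CANDIDATE_NAMES are hypothetical names, and trying 'cygmagic-1' is an assumption based on the Cygwin DLL name mentioned in these questions.

```python
import ctypes.util

# 'magic' and 'magic1' are the names tried by the quoted loader code;
# 'cygmagic-1' is the Cygwin DLL name and is included here as an assumption.
CANDIDATE_NAMES = ("magic", "magic1", "cygmagic-1")


def locate_libmagic(names=CANDIDATE_NAMES):
    """Return {name: resolved path or None} as seen by ctypes.util.find_library."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_libmagic().items():
        print(name, "->", path or "not found")
```

If every entry comes back as "not found", the interpreter cannot see the DLL under any of those names on its search path, which would be consistent with the ImportError above.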
Source: (StackOverflow)
Here's what I get when I call magic.from_buffer:
>>> import urllib2
>>> data = urllib2.urlopen('http://www.in.gov/judiciary/opinions/previous/wpd/05040501.bed.doc').read()
>>> len(data)
29696
>>> from lib import magic
>>> magic.from_buffer(data, mime=True)
At this point, I should be provided with application/msword, but instead I get nothing from the last call. What am I missing?
This works on my dev machine, but fails on my server. I'm fairly baffled.
Source: (StackOverflow)
I need to determine the MIME types of files without a suffix in Python 3, and I thought python-magic would be a fitting solution.
Unfortunately it does not work as described here:
https://github.com/ahupp/python-magic/blob/master/README.md
What happens is this:
>>> import magic
>>> magic.from_file("testdata/test.pdf")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'from_file'
So I had a look at the module object, which provides the class Magic, for which I found documentation here:
http://filemagic.readthedocs.org/en/latest/guide.html
I was surprised that this did not work either:
>>> with magic.Magic() as m:
...     pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> m = magic.Magic()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>>
I could not find any information about how to use the class Magic anywhere, so I went on with trial and error until I figured out that it only accepts instances of LP_magic_set for ms.
Some of them are returned by the module's methods magic.magic_set() and magic_t().
So I tried to instantiate Magic with either of them.
When I then call the file() method on the instance, it always returns an empty result, and the errlvl() method tells me error no. 22.
So how do I use magic anyway?
Source: (StackOverflow)
I am trying to find out the file type behind a file handle.
I need this because my apache_log_parser failed to parse a line and the whole program crashed:
Traceback (most recent call last):
File "VirtualEnvs/moslog/bin/mosloganalisys.py", line 108, in <module>
totalines = count_agent(logfilehandler,agentcount,totalines)
File "VirtualEnvs/moslog/bin/mosloganalisys.py", line 27, in count_agent
log_line_data = line_parser(line)
File "VirtualEnvs/moslog/lib/python2.7/site-packages/apache_log_parser/__init__.py", line 225, in parse
raise LineDoesntMatchException(log_line=log_line, regex=self.log_line_regex.pattern)
The reason was that the file handle was pointing to a gz file. Using the gzip library to decompress it did not help, because this was a double-compressed file (*.gz.gz), so the decompressed file was in turn another gzipped file.
So I tried to use the python-magic library to find out the file type, but it seems that a filename is needed:
72 """
73 self._thread_check()
---> 74 if not os.path.exists(filename):
75 raise IOError("File does not exist: " + filename)
76
/usr/lib64/python2.7/genericpath.pyc in exists(path)
16 """Test whether a path exists. Returns False for broken symbolic links"""
17 try:
---> 18 os.stat(path)
19 except os.error:
20 return False
I have already implemented a try/except statement, but this doesn't really solve the problem of processing a lot of useless lines.
What do you suggest I do?
Thanks
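For the specific case of gzip inside gzip, the type can be checked on the handle itself, because every gzip stream starts with the two bytes 0x1f 0x8b. The sketch below peels layers until the data no longer starts with those bytes. It assumes a seekable binary file object, and open_peeled and max_layers are hypothetical names chosen for this example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_peeled(raw, max_layers=5):
    """Return a binary file object with up to max_layers gzip wrappers removed.

    `raw` must be a seekable file object opened in binary mode.
    """
    current = raw
    for _ in range(max_layers):
        head = current.read(2)
        current.seek(0)  # rewind so the next reader starts at the beginning
        if head != GZIP_MAGIC:
            break
        # mode is passed explicitly because a GzipFile may itself be wrapped here
        current = gzip.GzipFile(fileobj=current, mode="rb")
    return current
```

The returned object yields bytes, so lines would need decoding before they are handed to a parser that expects text.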
Source: (StackOverflow)
I have an application where users should be able to upload a wide variety of files, but I need to know, for each file, whether I can safely display its textual representation as plain text.
Using python-magic like
m = Magic(mime=True).from_buffer(cgi.FieldStorage.file.read())
gives me the correct MIME type.
But sometimes the MIME type for scripts is application/*, so simply checking m.startswith('text/') is not enough.
Another site suggested using
m = Magic().from_buffer(cgi.FieldStorage.file.read())
and checking for 'text' in m.
Would the second approach be reliable enough for a collection of arbitrary file uploads or could someone give me another idea?
Thanks a lot.
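As another idea, the libmagic answer could be combined with a direct check on the bytes themselves. The sketch below is a heuristic written for this example, not part of python-magic: looks_like_text is a hypothetical name, it assumes uploads are UTF-8 (or ASCII), and it would reject UTF-16 text because of the NUL bytes.

```python
def looks_like_text(data, sample_size=4096):
    """Heuristically decide whether bytes can be displayed as plain text."""
    sample = data[:sample_size]
    if b"\x00" in sample:
        return False  # NUL bytes almost never occur in UTF-8 or ASCII text
    try:
        sample.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Tolerate only a multi-byte character cut off by the sample boundary.
        truncated = len(data) > len(sample)
        return truncated and exc.start >= len(sample) - 3
    return True
```

A file that passes this check can at least be decoded for display, whatever MIME type was reported for it.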
Source: (StackOverflow)
I installed python-magic (0.4.6) on my Win 7 64bit using pip.
I then installed cygwin 1.7.33-2 to provide the needed dlls and created a copy of cygmagic-1.dll named magic1.dll (see
When I run the Python 2.7.6 32bit shell, the "import magic" works fine.
However, calling
magic.from_file('c:\user\username\sample.txt')
gives me:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Python27\lib\site-packages\magic.py", line 119, in from_file
m = _get_magic_type(mime)
File "c:\Python27\lib\site-packages\magic.py", line 107, in _get_magic_type
i = instances.__dict__[mime] = Magic(mime=mime)
File "c:\Python27\lib\site-packages\magic.py", line 55, in __init__
self.cookie = magic_open(flags)
WindowsError: exception: access violation writing 0x00000000
Any ideas what causes this error and how I can fix it?
Thank you for your help!
Source: (StackOverflow)
I have written an email form class in Django with a FileField. I want to check the type of the uploaded file by its MIME type, and then limit the accepted file types to PDF, Word, and OpenOffice documents.
To this end, I have installed python-magic and would like to check file types as follows, per the python-magic documentation:
mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')
However, freshly uploaded files do not have a path on my server. I also do not know of any method of the mime object akin to "from_file_content" that determines the MIME type from the content of the file.
What is an effective way to use magic to verify file types of uploaded files in Django forms?
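python-magic does offer a content-based call, magic.from_buffer, so one possible approach is to read the first bytes of the uploaded file object and pass them to it. The sketch below keeps the detector as a parameter so that it runs without libmagic. The names sniff_upload, is_allowed and ALLOWED_MIME_TYPES are hypothetical, and the listed MIME types are an assumption about what libmagic reports for these formats.

```python
# MIME types to accept; an illustrative assumption, adjust to what libmagic reports.
ALLOWED_MIME_TYPES = {
    "application/pdf",
    "application/msword",
    "application/vnd.oasis.opendocument.text",
}


def sniff_upload(uploaded, detect, head_size=2048):
    """Run `detect` on the first bytes of a file-like upload, then rewind it."""
    uploaded.seek(0)
    head = uploaded.read(head_size)
    uploaded.seek(0)  # leave the upload readable for later processing
    return detect(head)


def is_allowed(uploaded, detect, allowed=ALLOWED_MIME_TYPES):
    """Return True if the detected MIME type is in the allowed set."""
    return sniff_upload(uploaded, detect) in allowed


# With python-magic installed, the detector could be:
#   import magic
#   detect = lambda head: magic.from_buffer(head, mime=True)
```

In a Django form this would typically be called from the field's clean method with the uploaded file object, raising a validation error when is_allowed returns False.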
Source: (StackOverflow)
I am writing a script to determine if a file is a valid MP3 using python-magic. With some files, the magic.from_file() function returns use count (30) exceeded. Is it possible to raise the limit, similar to the command line program: file --parameter name=40? If this is not possible with python-magic, is it possible with filemagic?
Source: (StackOverflow)
I am trying to install python-magic for Windows and I have followed all the instructions in https://github.com/ahupp/python-magic and repeated the process several times but I am still getting this error:
ImportError: failed to find libmagic. Check your installation
I have magic1.dll (along with the two other files the docs specified) in C:\Windows\System32 so I am not sure what the issue is. I would appreciate any help or workarounds.
Source: (StackOverflow)
I am trying to write a function debug decorator that will look at:
def foo(baz):
    bar = 1
    bar = 2
    return bar
and wrap it to:
def foo(baz):
    bar = 1
    print 'bar: {}'.format(bar)
    bar = 2
    print 'bar: {}'.format(bar)
    return bar
I need to play with the function as text, to grab "\w+(?=\s*[=])", but do not know how to access that. I have a decorator I modified from a blog that works, but I just tried changing it to:
from functools import wraps

class decorator_string_check(object):
    def __init__(self, func):
        self.func = func
        wraps(func)(self)

    def __call__(self, *args, **kwargs):
        # Inspect the wrapped function and its code object before calling it
        print dir(self.func)
        print dir(self.func.__code__)
        print self.func.__code__.__str__()
        ret = self.func(*args, **kwargs)
        return ret

@decorator_string_check
def fake(x):
    y = 6
    y = x
    return y

y = fake(9)
and am getting nothing of value, namely:
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
<code object fake at 0x7f98b8b1d030, file "./logging.py", line 48>
How do I work with the actual "func" text, to run regexes on it and find things I need within a decorator class object? Thank you
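The standard library's inspect.getsource returns a function's source text when the defining file is available, which is probably the missing piece here. The sketch below is written for Python 3 (unlike the Python 2 code above) and uses the regex from the question. report_assignments and assigned_names are hypothetical names, and the pattern will also match things like keyword arguments or '==' comparisons, so the ast module may be a more robust choice.

```python
import functools
import inspect
import re

ASSIGN_TARGET = re.compile(r"\w+(?=\s*[=])")  # the pattern from the question


def assigned_names(source_text):
    """Return every name that the pattern finds in front of an '=' sign."""
    return ASSIGN_TARGET.findall(source_text)


def report_assignments(func):
    """Decorator that reads the function's source text once, at decoration time."""
    try:
        source = inspect.getsource(func)
    except (OSError, TypeError):
        source = ""  # no source file available, e.g. code typed at a bare REPL
    names = assigned_names(source)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("{}: assigns {}".format(func.__name__, names))
        return func(*args, **kwargs)

    return wrapper
```

Rewriting the function to print after each assignment would go further than this, but having the source text as a string is the starting point for it.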
Source: (StackOverflow)
I need to get the MIME type of some files on Windows, so I've installed python-magic (on 32-bit Python 2.7.3).
It depends on the Unix magic library.
The author instructs users to get regex2.dll, zlib1.dll and magic1.dll from the gnuwin32 project.
So I saved the files to a folder and added the folder to my system PATH.
Now when I execute magic methods, I get a missing-file exception:
import magic
print(magic.Magic())
Traceback (most recent call last):
File "C:/Users/Admin/PycharmProjects/lex/lex.py", line 367, in <module>
test_magic()
File "C:/Users/Admin/PycharmProjects/lex/lex.py", line 364, in test_magic
print(magic.Magic())
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 52, in __init__
magic_load(self.cookie, magic_file)
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 188, in magic_load
return _magic_load(cookie, coerce_filename(filename))
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 139, in errorcheck
raise MagicException(err)
magic.MagicException: could not find any magic files!
The DLLs are in the PATH. I tried debugging, and magic1.dll is located correctly, but somewhere inside the library an exception is thrown.
Inside the gnuwin32 package I've found magic and magic.mgc. I placed them in the same folder, and got WindowsError: [Error 126] on
libmagic = None
# Let's try to find magic or magic1
dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1')
# This is necessary because find_library returns None if it doesn't find the library
if dll:
    libmagic = ctypes.CDLL(dll)
This obviously happens because Python tries to open the magic file, which is plain text, as a DLL. After adding .dll to the filenames in the code I get the same magic.MagicException: could not find any magic files!
What files am I missing?
UPDATE:
C:\Users\Admin>file C:\123.zip -m magic
file: could not find any magic files!
C:\Users\Admin>file C:\123.zip -m "C:\@DEV\@LIB\@Magic\GetGnuWin32\bin\magic"
C:\123.zip; ASCII text, with no line terminators
C:\Users\Admin>cd C:\@DEV\@LIB\@Magic\GetGnuWin32\bin
C:\@DEV\@LIB\@Magic\GetGnuWin32\bin>file C:\123.zip -m magic
C:\123.zip; ASCII text, with no line terminators
UPDATE 2 (SOLVED):
print(magic.Magic())
magic.MagicException: could not find any magic files!
print(magic.Magic(magic_file = 'magic'))
<magic.Magic instance at 0x02A5E198>
I just had to specify the magic file explicitly.
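Since the file command output above only finds 'magic' when run from the bin directory, passing an absolute path is probably more robust than the bare name. A small sketch of that idea follows. find_magic_db and MAGIC_DB_NAMES are hypothetical names, and the directory in the usage comment is a placeholder for wherever the gnuwin32 files were saved.

```python
import os

# The two database files found in the gnuwin32 package; 'magic' is tried
# first because that is the one reported to work above.
MAGIC_DB_NAMES = ("magic", "magic.mgc")


def find_magic_db(directories, names=MAGIC_DB_NAMES):
    """Return the absolute path of the first magic database found, or None."""
    for directory in directories:
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return os.path.abspath(candidate)
    return None


# Intended use (requires python-magic and libmagic, so left commented out):
#   import magic
#   db = find_magic_db(["path/to/gnuwin32/bin"])  # placeholder directory
#   detector = magic.Magic(magic_file=db)
```

With an absolute path, the call no longer depends on the directory the script happens to be started from.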
Source: (StackOverflow)