EzDevInfo.com

python-magic

A python wrapper for libmagic

How can I find the mimetype of a file without extension that is sent to me inside a zip file?

I have a simple problem: in a system I'm developing, users can send us zip files and I need to filter their contents (to block applications and malicious scripts).

Blocking the inner files by extension is easy, but files without an extension are very common, and the extension isn't a reliable indicator of a file's content anyway.

I've already tried python-magic, but it requires some packages that my server doesn't support, and I won't get any help from the server side. I also don't have the option of moving the system to another server, so python-magic is not available to me in this case.

Does anyone have an idea of how to check the file type by its header?
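
One possible approach that needs only the standard library is to read the first few bytes of each archive member with zipfile and compare them against a small table of well-known leading signatures. The sketch below is only an illustration: the names SIGNATURES, sniff_bytes and sniff_zip_members are made up here, the table is far from complete, and a prefix check like this should not be treated as a full security filter.

```python
import io
import zipfile

# Hypothetical, deliberately short table of leading "magic" bytes.
# Longer prefixes are not required to come first because none overlap.
SIGNATURES = [
    (b"MZ", "windows-executable"),
    (b"\x7fELF", "elf-executable"),
    (b"#!", "script-with-shebang"),
    (b"%PDF", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
]


def sniff_bytes(head):
    """Return the label of the first matching signature, or None."""
    for prefix, label in SIGNATURES:
        if head.startswith(prefix):
            return label
    return None


def sniff_zip_members(zip_source, sample_size=16):
    """Map each file name inside a zip archive to a sniffed label (or None).

    zip_source may be a path or a seekable binary file object.
    """
    results = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.filename.endswith("/"):
                continue  # skip directory entries
            # Only the first sample_size bytes of the member are read.
            with archive.open(info) as member:
                results[info.filename] = sniff_bytes(member.read(sample_size))
    return results
```

A member that maps to None is simply unknown to the table, so the caller still has to decide whether unknown content is allowed or rejected.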


Source: (StackOverflow)

How to Solve MagicException: "could not find any magic files" in s3cmd?

I am using s3cmd to upload files, and it always uploads .png files with the MIME type "image/x-png".

So I decided to install "python-magic".

Here is what I did:

I installed Python 2.7 x86 on Windows 7 64-bit (the "python-magic" manual says only x86 will work), downloaded from http://www.python.org/download/releases/2.7/

I installed the Setuptools Python extension from http://www.lfd.uci.edu/~gohlke/pythonlibs/#setuptools

I downloaded https://github.com/ahupp/python-magic and installed it with 'C:\Python27\python setup.py install'

I found the 3 files required by python-magic (magic1.dll, zlib1.dll, regex2.dll) and copied them to Windows/System32

Now s3cmd finally uses python-magic for MIME detection, but it fails with this error:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    An unexpected error has occurred.
  Please report the following lines to:
   s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Problem: MagicException: could not find any magic files!
S3cmd:   1.1.0-beta3

Traceback (most recent call last):
  File "T:\My\Downloads\s3cmd\s3cmd", line 1788, in <module>
    from S3.S3 import S3
  File "T:\My\Downloads\s3cmd\S3\S3.py", line 35, in <module>
    magic_ = magic.Magic(mime=True)
  File "build\bdist.win32\egg\magic.py", line 51, in __init__
    magic_load(self.cookie, magic_file)
  File "build\bdist.win32\egg\magic.py", line 138, in errorcheck
    raise MagicException(err)

Please advise how or where I can get the magic files.


Source: (StackOverflow)

Python-Magic ; Unable to find libmagic [duplicate]

I've downloaded and installed python-magic using "pip install python-magic". Source: https://github.com/ahupp/python-magic

It downloaded and installed without problems. I've also copied the 3 files (cygmagic-1.dll, cygwin1.dll, and cygz.dll) from my Cygwin installation into C:\Windows\System32.

Then, I also downloaded magic1.dll and placed it in System32 folder too.

But the command prompt is still giving me this error:

ImportError: failed to find libmagic. Check your installation

Why is this so?

EDIT: I've also added C:\cygwin\bin to PATH.


Source: (StackOverflow)

error message using python magic on Windows XP

I followed the instructions here...

https://github.com/ahupp/python-magic#dependencies

I ran pip install python-magic and it installed without any issues. Then I installed cygwin and added C:\cygwin\bin to my system path. When I run the python interpreter in a Windows command prompt and import magic I get this error...

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import magic
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\magic.py", line 161, in <module>
    raise ImportError('failed to find libmagic.  Check your installation')
ImportError: failed to find libmagic.  Check your installation
>>>

Did I miss a step?
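
A quick way to narrow this down is to ask ctypes the same question the import asks. The sketch below assumes the lookup is done with ctypes.util.find_library, as in the magic.py excerpt quoted in a later question on this page; the names "magic" and "magic1" come from that excerpt, "cygmagic-1" is the Cygwin DLL name mentioned above, and the helper names are invented for illustration.

```python
import ctypes.util

# Candidate library names; adjust for your platform.
CANDIDATE_NAMES = ("magic", "magic1", "cygmagic-1")


def locate_libmagic(names=CANDIDATE_NAMES, finder=ctypes.util.find_library):
    """Return {name: path or None} for each candidate library name."""
    return {name: finder(name) for name in names}


def first_found(names=CANDIDATE_NAMES, finder=ctypes.util.find_library):
    """Return the first path the finder reports, or None if nothing matched."""
    for name in names:
        path = finder(name)
        if path:
            return path
    return None

# Usage (run inside the same interpreter that fails to import magic):
#   print(locate_libmagic())
```

If every entry comes back as None, the interpreter cannot see the DLL on its search path. If a path is reported but loading still fails, one thing to check is whether the DLL and the Python interpreter are both 32-bit or both 64-bit, since a mismatched DLL cannot be loaded.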


Source: (StackOverflow)

Why is magic.from_buffer returning None?

Here's what I get when I call magic.from_buffer:

>>> import urllib2
>>> data = urllib2.urlopen('http://www.in.gov/judiciary/opinions/previous/wpd/05040501.bed.doc').read()
>>> len(data)
29696
>>> from lib import magic
>>> magic.from_buffer(data, mime=True)

At this point, I should be provided with application/msword, but instead I get nothing from the last call. What am I missing?

This works on my dev machine, but fails on my server. I'm fairly baffled.


Source: (StackOverflow)

How to use python-magic 5.19-1

I need to determine MIME types of files without a suffix in Python 3, and I thought python-magic would be a fitting solution. Unfortunately it does not work as described here: https://github.com/ahupp/python-magic/blob/master/README.md

What happens is this:

>>> import magic
>>> magic.from_file("testdata/test.pdf")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'from_file'

So I had a look at the object, which provides me with the class Magic for which I found documentation here: http://filemagic.readthedocs.org/en/latest/guide.html

I was surprised that this did not work either:

>>> with magic.Magic() as m:
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> m = magic.Magic()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> 

I could not find any information about how to use the class Magic, so I resorted to trial and error until I figured out that ms only accepts instances of LP_magic_set. Some of these are returned by the module's methods magic.magic_set() and magic_t(), so I tried to instantiate Magic with either of them. When I then call the file() method on the instance, it always returns an empty result, and the errlvl() method reports error no. 22. So how do I use magic?
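
The symptoms are consistent with a different module named magic being installed than the one the README describes: several unrelated packages install a module under that name. The sketch below is a guess at how to tell them apart and should be checked against the installed module. It assumes that the ahupp package exposes a module-level from_file, and that the bindings shipped with the file utility expose open(flags) returning an object with load(), file() and close(), plus a MAGIC_MIME_TYPE flag. The helper names detect_binding and mime_from_path are hypothetical.

```python
def detect_binding(magic_module):
    """Guess which Python 'magic' module was imported.

    Returns 'python-magic' when a module-level from_file exists,
    'libmagic-bindings' when a module-level open exists, else 'unknown'.
    """
    if hasattr(magic_module, "from_file"):
        return "python-magic"
    if hasattr(magic_module, "open"):
        return "libmagic-bindings"
    return "unknown"


def mime_from_path(magic_module, path):
    """Return the MIME type of path using whichever binding was detected."""
    kind = detect_binding(magic_module)
    if kind == "python-magic":
        return magic_module.from_file(path, mime=True)
    if kind == "libmagic-bindings":
        # Assumed API of the bundled bindings: open(), load(), file(), close().
        cookie = magic_module.open(magic_module.MAGIC_MIME_TYPE)
        try:
            cookie.load()  # load the default magic database
            return cookie.file(path)
        finally:
            cookie.close()
    raise RuntimeError("unrecognised 'magic' module")

# Usage:
#   import magic
#   print(detect_binding(magic), magic.__file__)
#   print(mime_from_path(magic, "testdata/test.pdf"))
```

Printing magic.__file__ alongside the result shows which installed package the module actually came from.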


Source: (StackOverflow)

Find the file-type behind a file-handler in python

I need to find out the file type behind a file handle.

I need this because my apache_log_parser failed to parse a line and the whole program crashed:

Traceback (most recent call last):
  File "VirtualEnvs/moslog/bin/mosloganalisys.py", line 108, in <module>
    totalines = count_agent(logfilehandler,agentcount,totalines)
  File "VirtualEnvs/moslog/bin/mosloganalisys.py", line 27, in count_agent
    log_line_data = line_parser(line)
  File "VirtualEnvs/moslog/lib/python2.7/site-packages/apache_log_parser/__init__.py", line 225, in parse
    raise LineDoesntMatchException(log_line=log_line, regex=self.log_line_regex.pattern)

The reason was that the file handle pointed to a gz file. Decompressing it with the gzip library did not help, because the file was double compressed (*.gz.gz), so the decompressed content was itself another gzipped file.

So I tried to use the python-magic library to find out the file type, but it seems that a filename is required.

     72         """
     73         self._thread_check()
---> 74         if not os.path.exists(filename):
     75             raise IOError("File does not exist: " + filename)
     76 

/usr/lib64/python2.7/genericpath.pyc in exists(path)
     16     """Test whether a path exists.  Returns False for broken symbolic links"""
     17     try:
---> 18         os.stat(path)
     19     except os.error:
     20         return False

I have already implemented a try:/except: statement, but this doesn't really solve the problem of processing a lot of useless lines.

What do you suggest to do? Thanks
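
For the specific gzip case no filename is needed, because gzip data starts with the two bytes 0x1f 0x8b and these can be read straight from the handle. The sketch below peeks at those bytes and keeps wrapping the handle in gzip.GzipFile until the content no longer looks gzipped. It assumes a seekable binary file object, and the names is_gzip and unwrap_gzip and the max_layers limit are invented for illustration.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip(fileobj):
    """Peek at the first two bytes of a seekable binary file object."""
    position = fileobj.tell()
    head = fileobj.read(2)
    fileobj.seek(position)  # put the handle back where it was
    return head == GZIP_MAGIC


def unwrap_gzip(fileobj, max_layers=5):
    """Strip nested gzip layers and return a readable binary file object.

    Raises ValueError if more than max_layers layers are found.
    """
    layers = 0
    while is_gzip(fileobj):
        if layers == max_layers:
            raise ValueError("more than %d gzip layers" % max_layers)
        # mode is passed explicitly so wrapping a GzipFile works as well.
        fileobj = gzip.GzipFile(fileobj=fileobj, mode="rb")
        layers += 1
    return fileobj

# Usage:
#   with open("access.log.gz.gz", "rb") as raw:
#       for line in unwrap_gzip(raw):
#           ...  # line is bytes; decode before parsing
```

The returned object yields bytes, so each line would need decoding before it is handed to a parser that expects text.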


Source: (StackOverflow)

How to reliably tell the uploaded file type (text or binary)?

I have an application where users should be able to upload a wide variety of files, but I need to know for each file, if I can safely display its textual representation as plain text.

Using python-magic like

m = Magic(mime=True).from_buffer(cgi.FieldStorage.file.read())

gives me the correct MIME type.

But sometimes, the MIME type for scripts is application/*, so simply looking for m.startswith('text/') is not enough.

Another site suggested using

m = Magic().from_buffer(cgi.FieldStorage.file.read())

and checking for 'text' in m.

Would the second approach be reliable enough for a collection of arbitrary file uploads or could someone give me another idea?
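
As a point of comparison, a content-based heuristic that does not depend on libmagic's description strings is to treat a sample as text when it contains no NUL bytes and decodes cleanly in an expected encoding. This is a different technique from either python-magic call above, not a replacement endorsed by the library, and the function name and the default encoding list are assumptions made for this sketch.

```python
def looks_like_text(data, encodings=("utf-8",)):
    """Heuristic: True if the byte sample has no NUL bytes and decodes cleanly.

    Known limits: UTF-16/UTF-32 text contains NUL bytes and is reported as
    binary, and a sample cut in the middle of a multi-byte character fails
    to decode, so pass whole files or cut samples with care.
    """
    if b"\x00" in data:
        return False
    for encoding in encodings:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
        return True
    return False

# Usage:
#   safe_to_show = looks_like_text(uploaded_bytes)
```

Adding a permissive single-byte encoding such as latin-1 to the list would make almost any NUL-free input count as text, so the encoding list is where the strictness of the check is decided.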

Thanks a lot.


Source: (StackOverflow)

python-magic WindowsError: access violation writing 0x00000000

I installed python-magic (0.4.6) on my Win 7 64bit using pip. I then installed cygwin 1.7.33-2 to provide the needed dlls and created a copy of cygmagic-1.dll named magic1.dll (see

When I run the Python 2.7.6 32bit shell, the "import magic" works fine.

However, a

magic.from_file('c:\user\username\sample.txt')

gives me a

Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>  
  File "c:\Python27\lib\site-packages\magic.py", line 119, in from_file    
    m = _get_magic_type(mime)  
  File "c:\Python27\lib\site-packages\magic.py", line 107, in _get_magic_type  
    i = instances.__dict__[mime] = Magic(mime=mime)  
  File "c:\Python27\lib\site-packages\magic.py", line 55, in __init__
    self.cookie = magic_open(flags)  

WindowsError: exception: access violation writing 0x00000000

Any ideas what causes this error and how I can fix it? Thank you for your help!


Source: (StackOverflow)

How does one use magic to verify file type in a Django form clean method?

I have written an email form class in Django with a FileField. I want to check the type of the uploaded file by inspecting its MIME type, and then limit the accepted file types to PDF, Word, and OpenOffice documents.

To this end, I have installed python-magic and would like to check file types as follows per the specs for python-magic:

mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')

However, recently uploaded files lack addresses on my server. I also do not know of any method of the mime object akin to "from_file_content" that checks for the mime type given the content of the file.

What is an effective way to use magic to verify file types of uploaded files in Django forms?
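
python-magic does offer a content-based call, magic.from_buffer(data, mime=True), which appears in another question on this page. The sketch below shows one way it might be wired into validation: read a sample from the uploaded file object, rewind it, and pass the bytes to a detector function. The detector is injected so the helpers can be exercised without libmagic; the helper names, the sample size, the field name in the usage comment and the allowed set are all assumptions for illustration.

```python
import io

# Illustrative whitelist only; extend it to match the formats you accept.
ALLOWED_MIME_TYPES = frozenset([
    "application/pdf",
    "application/msword",
    "application/vnd.oasis.opendocument.text",
])


def sniff_upload(uploaded, detect, sample_size=2048):
    """Return detect(first bytes of the upload), leaving the file rewound.

    uploaded is any object with read() and seek(), such as the file object
    Django hands to a form; detect maps bytes to a MIME type string.
    """
    uploaded.seek(0)
    sample = uploaded.read(sample_size)
    uploaded.seek(0)  # later code can still read the whole file
    return detect(sample)


def is_allowed(uploaded, detect, allowed=ALLOWED_MIME_TYPES):
    """True if the sniffed MIME type is in the allowed set."""
    return sniff_upload(uploaded, detect) in allowed

# Usage inside a form's clean_<field>() method (field name is hypothetical):
#   import magic
#   upload = self.cleaned_data["attachment"]
#   if not is_allowed(upload, lambda data: magic.from_buffer(data, mime=True)):
#       raise forms.ValidationError("Unsupported file type.")
```

Some document formats are zip containers internally, so the MIME type reported for them may vary between libmagic versions and is worth testing with real sample files.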


Source: (StackOverflow)

Get MP3 MIME type using python

I am writing a script to determine if a file is a valid MP3 using python-magic. With some files, the magic.from_file() function returns "use count (30) exceeded". Is it possible to raise the limit, similar to what the command-line program allows with file --parameter name=40? If this is not possible with python-magic, is it possible with filemagic?


Source: (StackOverflow)

Python-magic installation error - ImportError: failed to find libmagic

I am trying to install python-magic for Windows and I have followed all the instructions in https://github.com/ahupp/python-magic and repeated the process several times but I am still getting this error:

ImportError: failed to find libmagic. Check your installation

I have magic1.dll (along with the two other files the docs specified) in C:\Windows\System32 so I am not sure what the issue is. I would appreciate any help or workarounds.


Source: (StackOverflow)

parse python functions as a string within decorator

I am trying to write a function debug decorator that will look at:

def foo(baz):
  bar = 1
  bar = 2
  return bar

and rewrite it as:

def foo(baz):
  bar = 1
  print 'bar: {}'.format(bar)
  bar = 2
  print 'bar: {}'.format(bar)
  return bar

I need to work with the function as text, to grab "\w+(?=\s*[=])", but I do not know how to access that text. I have a decorator that I modified from a blog and that works, but I just tried changing it to:

class decorator_string_check(object):

   def __init__(self, func):
        self.func = func
        wraps(func)(self)

   def __call__(self, *args, **kwargs):
        print dir(self.func)
        print dir(self.func.__code__)
        print self.func.__code__.__str__()
        ret = self.func(*args, **kwargs)
        return ret

@decorator_string_check
def fake(x):
    y = 6
    y = x
    return y

y = fake(9)

and am getting nothing of value, namely:

['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
<code object fake at 0x7f98b8b1d030, file "./logging.py", line 48>

How do I work with the actual "func" text, to run regexes on it and find things I need within a decorator class object? Thank you
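
The standard library's inspect.getsource(func) returns a function's source as a string when the defining file is available, which is the text the regex from the question needs. The sketch below only collects the matched names and stores them on the wrapper; it does not rewrite the function to print values. The names assigned_names and report_assignments are invented here, and the regex is the one from the question, so it also matches names followed by == or by a keyword-argument =.

```python
import functools
import inspect
import re

# Pattern taken from the question: a word followed by optional space and '='.
ASSIGNMENT_TARGET = re.compile(r"\w+(?=\s*[=])")


def assigned_names(source_text):
    """Return every regex match in the given source text, in order."""
    return ASSIGNMENT_TARGET.findall(source_text)


def report_assignments(func):
    """Decorator that records the names matched in func's source text."""
    try:
        source = inspect.getsource(func)
    except (OSError, TypeError):
        source = ""  # no source available, e.g. built-ins or interactive use
    names = assigned_names(source)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    wrapper.assigned_names = names  # inspect this attribute after decorating
    return wrapper

# Usage:
#   @report_assignments
#   def fake(x):
#       y = 6
#       y = x
#       return y
#   print(fake.assigned_names)
```

For anything beyond simple cases, parsing the source with the ast module is likely to be more robust than a regex, because it distinguishes real assignments from comparisons and keyword arguments.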


Source: (StackOverflow)

Missing files for `magic` library on Windows

I need to get the MIME type of some files on Windows, so I've installed python-magic (on 32-bit Python 2.7.3).
It depends on the Unix magic library.
The author says to get regex2.dll, zlib1.dll and magic1.dll from the gnuwin32 project, so I saved the files to a folder and added the folder to my system PATH.
Now when I call magic methods, I get a missing-file exception:

import magic
print(magic.Magic())

Traceback (most recent call last):
File "C:/Users/Admin/PycharmProjects/lex/lex.py", line 367, in <module>
  test_magic()
File "C:/Users/Admin/PycharmProjects/lex/lex.py", line 364, in test_magic
  print(magic.Magic())
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 52, in __init__
  magic_load(self.cookie, magic_file)
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 188, in magic_load
  return _magic_load(cookie, coerce_filename(filename))
File "C:\Python27\lib\site-packages\python_magic-0.4.3-py2.7.egg\magic.py", line 139, in errorcheck
  raise MagicException(err)
magic.MagicException: could not find any magic files!

The DLLs are on the PATH. I tried debugging, and magic1.dll is located correctly, but something inside the library throws an exception.
Inside the gnuwin32 package I found magic and magic.mgc. I placed them in the same folder and got WindowsError: [Error 126] at

libmagic = None  
# Let's try to find magic or magic1  
dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1')  

# This is necessary because find_library returns None if it doesn't find the library
if dll:
    libmagic = ctypes.CDLL(dll)

This obviously happens because Python tries to open the magic file, which is plain text, as a DLL. After adding .dll to the filenames in the code I get the same magic.MagicException: could not find any magic files!.
What files am I missing?

UPDATE:

C:\Users\Admin>file C:\123.zip -m magic
file: could not find any magic files!

C:\Users\Admin>file C:\123.zip -m "C:\@DEV\@LIB\@Magic\GetGnuWin32\bin\magic"
C:\123.zip; ASCII text, with no line terminators

C:\Users\Admin>cd C:\@DEV\@LIB\@Magic\GetGnuWin32\bin

C:\@DEV\@LIB\@Magic\GetGnuWin32\bin>file C:\123.zip -m magic
C:\123.zip; ASCII text, with no line terminators

UPDATE 2 (SOLVED):

print(magic.Magic())
magic.MagicException: could not find any magic files!

print(magic.Magic(magic_file = 'magic'))
<magic.Magic instance at 0x02A5E198>

I just had to specify the magic file explicitly.
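
Because a bare 'magic' only resolves when the working directory happens to contain the file, as the command-line test in the first update shows, passing an absolute path is likely to be more dependable. The sketch below searches a list of directories for the database file and returns the full path for use with the magic_file argument shown above. The helper name find_magic_db and the order in which file names are tried are assumptions for illustration.

```python
import os


def find_magic_db(directories, names=("magic.mgc", "magic")):
    """Return the first existing magic database path, or None.

    Each directory is checked for each file name in the given order.
    """
    for directory in directories:
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None

# Usage with python-magic (not executed here):
#   import magic
#   db = find_magic_db([r"C:\@DEV\@LIB\@Magic\GetGnuWin32\bin"])
#   if db is None:
#       raise RuntimeError("no magic database found")
#   detector = magic.Magic(magic_file=db)
```

libmagic also consults the MAGIC environment variable for the database location, which may be an alternative when the calling code cannot be changed.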


Source: (StackOverflow)