EzDevInfo.com

lxml

The lxml XML toolkit for Python

Setup.py: install lxml with Python2.6 on CentOS

I have installed Python 2.6.6 on CentOS 5.4,

[siyuan.tong@SC-055 lxml-2.3beta1]$ python
Python 2.6.6 (r266:84292, Jan  4 2011, 09:49:55) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

I want to use the lxml module, but build from sources failed:

src/lxml/lxml.etree.c:157929: error: ‘xsltLibxsltVersion’ undeclared (first use in this function)
src/lxml/lxml.etree.c:157941: error: ‘__pyx_v_4lxml_5etree_XSLT_DOC_DEFAULT_LOADER’ undeclared (first use in this function)
src/lxml/lxml.etree.c:157941: error: ‘xsltDocDefaultLoader’ undeclared (first use in this function)
src/lxml/lxml.etree.c:157950: error: ‘__pyx_f_4lxml_5etree__xslt_doc_loader’ undeclared (first use in this function)
error: command 'gcc' failed with exit status 1

Source: (StackOverflow)

builtins.TypeError: must be str, not bytes

I've converted my scripts form python 2.7 to 3.2,and I have some bug.

# -*- coding: utf-8 -*-
import time
from datetime import date
from lxml import etree
from collections import OrderedDict

# Create the root element
page = etree.Element('results')

# Make a new document tree
doc = etree.ElementTree(page)

# Add the subelements
pageElement = etree.SubElement(page, 'Country',Tim = 'Now', 
                                      name='Germany', AnotherParameter = 'Bye',
                                      Code='DE',
                                      Storage='Basic')
pageElement = etree.SubElement(page, 'City', 
                                      name='Germany',
                                      Code='PZ',
                                      Storage='Basic',AnotherParameter = 'Hello')
# For multiple multiple attributes, use as shown above

# Save to XML file
outFile = open('output.xml', 'w')
doc.write(outFile) 

at last line I got an error:

builtins.TypeError: must be str, not bytes
File "C:\PythonExamples\XmlReportGeneratorExample.py", line 29, in <module>
  doc.write(outFile)
File "c:\Python32\Lib\site-packages\lxml\etree.pyd", line 1853, in lxml.etree._ElementTree.write (src/lxml/lxml.etree.c:44355)
File "c:\Python32\Lib\site-packages\lxml\etree.pyd", line 478, in lxml.etree._tofilelike (src/lxml/lxml.etree.c:90649)
File "c:\Python32\Lib\site-packages\lxml\etree.pyd", line 282, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:7972)
File "c:\Python32\Lib\site-packages\lxml\etree.pyd", line 378, in lxml.etree._FilelikeWriter.write (src/lxml/lxml.etree.c:89527)

I've installed python 3.2, and I've installed lxml-2.3.win32-py3.2.exe .

At 2.7 it works.


Source: (StackOverflow)

Advertisements

Building lxml for Python 2.7 on Windows

I am trying to build lxml for Python 2.7 on Windows 64 bit machine. I couldn't find lxml egg for Python 2.7 version. So I am compiling it from sources. I am following instructions on this site

http://lxml.de/build.html

under static linking section. I am getting error

C:\Documents and Settings\Administrator\Desktop\lxmlpackage\lxml-2.2.6\lxml-2.2.
6>python setup.py bdist_wininst --static
Building lxml version 2.2.6.
NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' need
s to be available.
ERROR: 'xslt-config' is not recognized as an internal or external command,
operable program or batch file.

** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt
Building against libxml2/libxslt in one of the following directories:
  ..\libxml2-2.7.6--win32--w2k--x64\lib
  ..\libxslt-1.1.26--win32--w2k--x64--0002\lib
  ..\zlib-1.2.4--win32--w2k--x64
  ..\iconv-1.9.1--win32--w2k--x64-0001\lib
running bdist_wininst
running build
running build_py
running build_ext
building 'lxml.etree' extension
error: Unable to find vcvarsall.bat

Can any one help me with this? I tried setting the path to have Microsoft Visual Studio.. I can run vcvarsall.bat from the commandline.. but python is having problems


Source: (StackOverflow)

lxml etree xmlparser namespace problem

I have an xml doc that I am trying to parse using Etree.lxml

<Envelope xmlns="http://www.xxx.com/zzz/yyy">
  <Header>
    <Version>1</Version>
  </Header>
  <Body>
    some stuff
  <Body>
<Envelope>

My code is:

path = "path to xml file"
from lxml import etree as ET
parser = ET.XMLParser(ns_clean=True)
dom = ET.parse(path, parser)
dom.getroot()

When I try to get dom.getroot() I get:

<Element {http://www.xxx.com/zzz/yyy}Envelope at 28adacac>

However I only want:

<Element Envelope at 28adacac>

When i do

dom.getroot().find("Body")

I get nothing returned. However, when I

dom.getroot().find("{http://www.xxx.com/zzz/yyy}Body") 

I get a result.

I thought passing ns_clean=True to the parser would prevent this.

Any ideas?


Source: (StackOverflow)

Installing lxml module in python

while running a python script, I got this error

  from lxml import etree
ImportError: No module named lxml

now I tried to install lxml

sudo easy_install lmxl

but it gives me the following error

Building lxml version 2.3.beta1.
NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available.
ERROR: /bin/sh: xslt-config: not found

** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt

src/lxml/lxml.etree.c:4: fatal error: Python.h: No such file or directory
compilation terminated.
error: Setup script exited with error: command 'gcc' failed with exit status 1

Source: (StackOverflow)

Find python lxml version

How can I find the installed python-lxml version in a Linux system?

>>> import lxml
>>> lxml.__version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__version__'

>>> from pprint import pprint
>>> pprint(dir(lxml))
['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '__path__',
 'get_include',
 'os']
>>>

Can't seem to find it


Source: (StackOverflow)

lxml installation error ubuntu 14.04 (internal compiler error)

I am having problems with installing lxml. I have tried the solutions of the relative questions in this site and other sites but could not fix the problem. Need some suggestions/solution on this.

I am providing the full log after executing pip install lxml,

Downloading/unpacking lxml
  Downloading lxml-3.3.5.tar.gz (3.5MB): 3.5MB downloaded
  Running setup.py (path:/tmp/pip_build_root/lxml/setup.py) egg_info for package lxml
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
      warnings.warn(msg)
    Building lxml version 3.3.5.
    Building without Cython.
    Using build configuration of libxslt 1.1.28

    warning: no previously-included files found matching '*.py'
Installing collected packages: lxml
  Running setup.py install for lxml
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
      warnings.warn(msg)
    Building lxml version 3.3.5.
    Building without Cython.
    Using build configuration of libxslt 1.1.28
    building 'lxml.etree' extension
    x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/tmp/pip_build_root/lxml/src/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
    x86_64-linux-gnu-gcc: internal compiler error: Killed (program cc1)
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <file:///usr/share/doc/gcc-4.8/README.Bugs> for instructions.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 4
    Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-KUq9VD-record/install-record.txt --single-version-externally-managed --compile:
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'

  warnings.warn(msg)

Building lxml version 3.3.5.

Building without Cython.

Using build configuration of libxslt 1.1.28

running install

running build

running build_py

creating build

creating build/lib.linux-x86_64-2.7

creating build/lib.linux-x86_64-2.7/lxml

copying src/lxml/pyclasslookup.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/doctestcompare.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/builder.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/cssselect.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/sax.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/_elementpath.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/ElementInclude.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/__init__.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/usedoctest.py -> build/lib.linux-x86_64-2.7/lxml

creating build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/__init__.py -> build/lib.linux-x86_64-2.7/lxml/includes

creating build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/html5parser.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/ElementSoup.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/builder.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/clean.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_diffcommand.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/diff.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_html5builder.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_setmixin.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/defs.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/formfill.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/soupparser.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/__init__.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/usedoctest.py -> build/lib.linux-x86_64-2.7/lxml/html

creating build/lib.linux-x86_64-2.7/lxml/isoschematron

copying src/lxml/isoschematron/__init__.py -> build/lib.linux-x86_64-2.7/lxml/isoschematron

copying src/lxml/lxml.etree.h -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/lxml.etree_api.h -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/includes/config.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xpath.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/tree.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/dtdvalid.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlparser.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/etreepublic.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xslt.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/schematron.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/uri.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/c14n.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xinclude.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlschema.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlerror.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/relaxng.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/htmlparser.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/etree_defs.h -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/lxml-version.h -> build/lib.linux-x86_64-2.7/lxml/includes

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/rng

copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/rng

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

running build_ext

building 'lxml.etree' extension

creating build/temp.linux-x86_64-2.7

creating build/temp.linux-x86_64-2.7/src

creating build/temp.linux-x86_64-2.7/src/lxml

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/tmp/pip_build_root/lxml/src/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w

x86_64-linux-gnu-gcc: internal compiler error: Killed (program cc1)

Please submit a full bug report,

with preprocessed source if appropriate.

See <file:///usr/share/doc/gcc-4.8/README.Bugs> for instructions.

error: command 'x86_64-linux-gnu-gcc' failed with exit status 4

----------------------------------------
Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-KUq9VD-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_root/lxml
Storing debug log for failure in /root/.pip/pip.log

Also, the pip.log file looks like this,

status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 283, in run
    requirement_set.install(install_options, global_options, root=options.root_path)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1435, in install
    requirement.install(install_options, global_options, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 706, in install
    cwd=self.source_dir, filter_stdout=self._filter_install, show_stdout=False)
  File "/usr/lib/python2.7/dist-packages/pip/util.py", line 697, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-KUq9VD-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_root/lxml

dmesg | tail command shows output like this,

[1744367.676147] Out of memory: Kill process 25518 (cc1) score 388 or sacrifice child
[1744367.676665] Killed process 25518 (cc1) total-vm:242352kB, anon-rss:200608kB, file-rss:0kB

It's seems like a memory issue. I am taking reference form this question


Source: (StackOverflow)

Installing lxml with pip in virtualenv Ubuntu 12.10 error: command 'gcc' failed with exit status 4

I'm having the following error when trying to run "pip install lxml" into a virtualenv in Ubuntu 12.10 x64. I have Python 2.7.

I have seen other related questions here about the same problem and tried installing python-dev, libxml2-dev and libxslt1-dev.

Please take a look of the traceback from the moment I tip the command to the moment when the error occurs.

Downloading/unpacking lxml
  Running setup.py egg_info for package lxml
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
      warnings.warn(msg)
    Building lxml version 3.1.2.
    Building without Cython.
    Using build configuration of libxslt 1.1.26
    Building against libxml2/libxslt in the following directory: /usr/lib

    warning: no files found matching '*.txt' under directory 'src/lxml/tests'
Installing collected packages: lxml
  Running setup.py install for lxml
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
      warnings.warn(msg)
    Building lxml version 3.1.2.
    Building without Cython.
    Using build configuration of libxslt 1.1.26
    Building against libxml2/libxslt in the following directory: /usr/lib
    building 'lxml.etree' extension
    gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/home/admin/.virtualenvs/dev.actualito.com/build/lxml/src/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o
    src/lxml/lxml.etree.c: In function '__pyx_f_4lxml_5etree__getFilenameForFile':
    src/lxml/lxml.etree.c:26851:7: warning: variable '__pyx_clineno' set but not used [-Wunused-but-set-variable]
    src/lxml/lxml.etree.c:26850:15: warning: variable '__pyx_filename' set but not used [-Wunused-but-set-variable]
    src/lxml/lxml.etree.c:26849:7: warning: variable '__pyx_lineno' set but not used [-Wunused-but-set-variable]
    src/lxml/lxml.etree.c: In function '__pyx_pf_4lxml_5etree_4XSLT_18__call__':
    src/lxml/lxml.etree.c:138273:81: warning: passing argument 1 of '__pyx_f_4lxml_5etree_12_XSLTContext__copy' from incompatible pointer type [enabled by default]
    src/lxml/lxml.etree.c:136229:52: note: expected 'struct __pyx_obj_4lxml_5etree__XSLTContext *' but argument is of type 'struct __pyx_obj_4lxml_5etree__BaseContext *'
    src/lxml/lxml.etree.c: In function '__pyx_f_4lxml_5etree__copyXSLT':
    src/lxml/lxml.etree.c:139667:79: warning: passing argument 1 of '__pyx_f_4lxml_5etree_12_XSLTContext__copy' from incompatible pointer type [enabled by default]
    src/lxml/lxml.etree.c:136229:52: note: expected 'struct __pyx_obj_4lxml_5etree__XSLTContext *' but argument is of type 'struct __pyx_obj_4lxml_5etree__BaseContext *'
    src/lxml/lxml.etree.c: At top level:
    src/lxml/lxml.etree.c:12384:13: warning: '__pyx_f_4lxml_5etree_displayNode' defined but not used [-Wunused-function]
    gcc: internal compiler error: Killed (program cc1)
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See  for instructions.
    error: command 'gcc' failed with exit status 4
    Complete output from command /home/admin/.virtualenvs/dev.actualito.com/bin/python -c "import setuptools;__file__='/home/admin/.virtualenvs/dev.actualito.com/build/lxml/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-asDtN5-record/install-record.txt --single-version-externally-managed --install-headers /home/admin/.virtualenvs/dev.actualito.com/include/site/python2.7:
    /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'

  warnings.warn(msg)

Building lxml version 3.1.2.

Building without Cython.

Using build configuration of libxslt 1.1.26

Building against libxml2/libxslt in the following directory: /usr/lib

running install

running build

running build_py

copying src/lxml/includes/lxml-version.h -> build/lib.linux-x86_64-2.7/lxml/includes

running build_ext

building 'lxml.etree' extension

gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/home/admin/.virtualenvs/dev.actualito.com/build/lxml/src/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o

src/lxml/lxml.etree.c: In function '__pyx_f_4lxml_5etree__getFilenameForFile':

src/lxml/lxml.etree.c:26851:7: warning: variable '__pyx_clineno' set but not used [-Wunused-but-set-variable]

src/lxml/lxml.etree.c:26850:15: warning: variable '__pyx_filename' set but not used [-Wunused-but-set-variable]

src/lxml/lxml.etree.c:26849:7: warning: variable '__pyx_lineno' set but not used [-Wunused-but-set-variable]

src/lxml/lxml.etree.c: In function '__pyx_pf_4lxml_5etree_4XSLT_18__call__':

src/lxml/lxml.etree.c:138273:81: warning: passing argument 1 of '__pyx_f_4lxml_5etree_12_XSLTContext__copy' from incompatible pointer type [enabled by default]

src/lxml/lxml.etree.c:136229:52: note: expected 'struct __pyx_obj_4lxml_5etree__XSLTContext *' but argument is of type 'struct __pyx_obj_4lxml_5etree__BaseContext *'

src/lxml/lxml.etree.c: In function '__pyx_f_4lxml_5etree__copyXSLT':

src/lxml/lxml.etree.c:139667:79: warning: passing argument 1 of '__pyx_f_4lxml_5etree_12_XSLTContext__copy' from incompatible pointer type [enabled by default]

src/lxml/lxml.etree.c:136229:52: note: expected 'struct __pyx_obj_4lxml_5etree__XSLTContext *' but argument is of type 'struct __pyx_obj_4lxml_5etree__BaseContext *'

src/lxml/lxml.etree.c: At top level:

src/lxml/lxml.etree.c:12384:13: warning: '__pyx_f_4lxml_5etree_displayNode' defined but not used [-Wunused-function]

gcc: internal compiler error: Killed (program cc1)

Please submit a full bug report,

with preprocessed source if appropriate.

See  for instructions.

error: command 'gcc' failed with exit status 4

----------------------------------------
Command /home/admin/.virtualenvs/dev.actualito.com/bin/python -c "import setuptools;__file__='/home/admin/.virtualenvs/dev.actualito.com/build/lxml/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-asDtN5-record/install-record.txt --single-version-externally-managed --install-headers /home/admin/.virtualenvs/dev.actualito.com/include/site/python2.7 failed with error code 1 in /home/admin/.virtualenvs/dev.actualito.com/build/lxml
Storing complete log in /home/admin/.pip/pip.log

Source: (StackOverflow)

How do you install lxml on OS X Leopard without using MacPorts or Fink?

I've tried this and run in to problems a bunch of times in the past. Does anyone have a recipe for installing lxml on OS X without MacPorts or Fink that definitely works?

Preferably with complete 1-2-3 steps for downloading and building each of the dependencies.


Source: (StackOverflow)

How to select following sibling/xml tag using xpath

I have an HTML file (from Newegg) and their HTML is organized like below. All of the data in their specifications table is 'desc' while the titles of each section are in 'name.' Below are two examples of data from Newegg pages.

<tr>
    <td class="name">Brand</td>
    <td class="desc">Intel</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Core i5</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">LGA 1156</td>

<tr>
    <td class="name">Brand</td>
    <td class="desc">AMD</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Phenom II X4</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">Socket AM3</td>
</tr>

In the end I would like to have a class for a CPU (which is already set up) that consists of a Brand, Series, Cores, and Socket type to store each of the data. This is the only way I can think of to go about doing this:

if(parsedDocument.xpath(tr/td[@class="name"])=='Brand'):
    CPU.brand = parsedDocument.xpath(tr/td[@class="name"]/nextsibling?).text

And doing this for the rest of the values. How would I accomplish the nextsibling and is there an easier way of doing this?


Source: (StackOverflow)

Parsing HTML in python - lxml or BeautifulSoup? Which of these is better for what kinds of purposes?

From what I can make out, the two main HTML parsing libraries in Python are lxml and BeautifulSoup. I've chosen BeautifulSoup for a project I'm working on, but I chose it for no particular reason other than finding the syntax a bit easier to learn and understand. But I see a lot of people seem to favour lxml and I've heard that lxml is faster.

So I'm wondering what are the advantages of one over the other? When would I want to use lxml and when would I be better off using BeautifulSoup? Are there any other libraries worth considering?


Source: (StackOverflow)

src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory

  1. I am running the following comand for installing the packages in that file " pip install -r requirements.txt --download-cache=~/tmp/pip-cache".

  2. requirement.txt contains pacakages like

# Data formats
# ------------
PIL==1.1.7 # 

html5lib==0.90
httplib2==0.7.4
lxml==2.3.1

# Documentation
# -------------
Sphinx==1.1
docutils==0.8.1

# Testing
# -------
behave==1.1.0
dingus==0.3.2
django-testscenarios==0.7.2
mechanize==0.2.5
mock==0.7.2
testscenarios==0.2
testtools==0.9.14
wsgi_intercept==0.5.1

while comming to install "lxml" packages i am getting the following eror

Requirement already satisfied (use --upgrade to upgrade): django-testproject>=0.1.1 in /usr/lib/python2.7/site-packages/django_testproject-0.1.1-py2.7.egg (from django-testscenarios==0.7.2->-r requirements.txt (line 33))
Installing collected packages: lxml, Sphinx, docutils, behave, dingus, mock, testscenarios, testtools, South
  Running setup.py install for lxml
    Building lxml version 2.3.1.
    Building without Cython.
    ERROR: /bin/sh: xslt-config: command not found

    ** make sure the development packages of libxml2 and libxslt are installed **

    Using build configuration of libxslt
    building 'lxml.etree' extension
    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.7/src/lxml/lxml.etree.o -w
    In file included from src/lxml/lxml.etree.c:239:0:
    src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    Complete output from command /usr/bin/python -c "import setuptools;__file__='/root/Projects/ir/build/lxml/setup.py';execfile(__file__)" install --single-version-externally-managed --record /tmp/pip-SwjFm3-record/install-record.txt:
    Building lxml version 2.3.1.

Building without Cython.

ERROR: /bin/sh: xslt-config: command not found



** make sure the development packages of libxml2 and libxslt are installed **



Using build configuration of libxslt

running install

running build

running build_py

copying src/lxml/cssselect.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/__init__.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/sax.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/pyclasslookup.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/usedoctest.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/doctestcompare.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/_elementpath.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/ElementInclude.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/builder.py -> build/lib.linux-i686-2.7/lxml

copying src/lxml/html/clean.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/__init__.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/_dictmixin.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/ElementSoup.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/usedoctest.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/defs.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/builder.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/_html5builder.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/diff.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/html5parser.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/_diffcommand.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/_setmixin.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/soupparser.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/html/formfill.py -> build/lib.linux-i686-2.7/lxml/html

copying src/lxml/isoschematron/__init__.py -> build/lib.linux-i686-2.7/lxml/isoschematron

copying src/lxml/etreepublic.pxd -> build/lib.linux-i686-2.7/lxml

copying src/lxml/tree.pxd -> build/lib.linux-i686-2.7/lxml

copying src/lxml/etree_defs.h -> build/lib.linux-i686-2.7/lxml

copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/rng

copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.linux-i686-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

running build_ext

building 'lxml.etree' extension

gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.7/src/lxml/lxml.etree.o -w

In file included from src/lxml/lxml.etree.c:239:0:

src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory

compilation terminated.

error: command 'gcc' failed with exit status 1

----------------------------------------
Command /usr/bin/python -c "import setuptools;__file__='/root/Projects/ir/build/lxml/setup.py';execfile(__file__)" install --single-version-externally-managed --record /tmp/pip-SwjFm3-record/install-record.txt failed with error code 1
Storing complete log in /root/.pip/pip.log

Can anyone check and guide me what would be the problem and why?Any package missing to install.


Source: (StackOverflow)

How to install lxml on Ubuntu

I'm having difficulty installing lxml with easy_install on Ubuntu 11.

When I type $ easy_install lxml I get:

Searching for lxml
Reading http://pypi.python.org/simple/lxml/
Reading http://codespeak.net/lxml
Best match: lxml 2.3
Downloading http://lxml.de/files/lxml-2.3.tgz
Processing lxml-2.3.tgz
Running lxml-2.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-7UdQOZ/lxml-2.3/egg-dist-tmp-GacQGy
Building lxml version 2.3.
Building without Cython.
ERROR: /bin/sh: xslt-config: not found

** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt 
In file included from src/lxml/lxml.etree.c:227:0:
src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory
compilation terminated.

It seems that libxslt or libxml2 is not installed. I've tried following the instructions at http://www.techsww.com/tutorials/libraries/libxslt/installation/installing_libxslt_on_ubuntu_linux.php and http://www.techsww.com/tutorials/libraries/libxml/installation/installing_libxml_on_ubuntu_linux.php with no success.

If I try wget ftp://xmlsoft.org/libxml2/libxml2-sources-2.6.27.tar.gz I get

<successful connection info>
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /libxml2 ... done.
==> SIZE libxml2-sources-2.6.27.tar.gz ... done.
==> PASV ... done.    ==> RETR libxml2-sources-2.6.27.tar.gz ... 
No such file `libxml2-sources-2.6.27.tar.gz'.

If I try the other first, I'll get to ./configure --prefix=/usr/local/libxslt --with-libxml-prefix=/usr/local/libxml2 and that will fail eventually with:

checking for libxml libraries >= 2.6.27... configure: error: Could not find libxml2 anywhere, check ftp://xmlsoft.org/.

I've tried both versions 2.6.27 and 2.6.29 of libxml2 with no difference.

Leaving no stone unturned, I have successfully done sudo apt-get install libxml2-dev, but this changes nothing.


Source: (StackOverflow)

easy_install lxml on Python 2.7 on Windows

I'm using python 2.7 on Windows. How come the following error occurs when I try to install [lxml][1] using [setuptools][2]'s easy_install?

C:\>easy_install lxml
Searching for lxml
Reading http://pypi.python.org/simple/lxml/
Reading http://codespeak.net/lxml
Best match: lxml 2.3.3
Downloading http://lxml.de/files/lxml-2.3.3.tgz
Processing lxml-2.3.3.tgz
Running lxml-2.3.3\setup.py -q bdist_egg --dist-dir c:\users\my_user\appdata\local\temp\easy_install-mtrdj2\lxml-2.3.3\egg-dist-tmp-tq8rx4
Building lxml version 2.3.3.
Building without Cython.
ERROR: 'xslt-config' is not recognized as an internal or external command,
operable program or batch file.

** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt
warning: no files found matching 'lxml.etree.c' under directory 'src\lxml'
warning: no files found matching 'lxml.objectify.c' under directory 'src\lxml'
warning: no files found matching 'lxml.etree.h' under directory 'src\lxml'
warning: no files found matching 'lxml.etree_api.h' under directory 'src\lxml'
warning: no files found matching 'etree_defs.h' under directory 'src\lxml'
warning: no files found matching 'pubkey.asc' under directory 'doc'
warning: no files found matching 'tagpython*.png' under directory 'doc'
warning: no files found matching 'Makefile' under directory 'doc'
error: Setup script exited with error: Unable to find vcvarsall.bat

Downloading the package and running setup.py install also doesn't help:

D:\My Documents\Installs\Dev\Python\lxml\lxml-2.3.3>setup.py install
Building lxml version 2.3.3.
Building without Cython.
ERROR: 'xslt-config' is not recognized as an internal or external command,
operable program or batch file.

** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt
running install
running bdist_egg
running egg_info
writing src\lxml.egg-info\PKG-INFO
writing top-level names to src\lxml.egg-info\top_level.txt
writing dependency_links to src\lxml.egg-info\dependency_links.txt
reading manifest file 'src\lxml.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'lxml.etree.c' under directory 'src\lxml'
warning: no files found matching 'lxml.objectify.c' under directory 'src\lxml'
warning: no files found matching 'lxml.etree.h' under directory 'src\lxml'
warning: no files found matching 'lxml.etree_api.h' under directory 'src\lxml'
warning: no files found matching 'etree_defs.h' under directory 'src\lxml'
warning: no files found matching 'pubkey.asc' under directory 'doc'
warning: no files found matching 'tagpython*.png' under directory 'doc'
warning: no files found matching 'Makefile' under directory 'doc'
writing manifest file 'src\lxml.egg-info\SOURCES.txt'
installing library code to build\bdist.win32\egg
running install_lib
running build_py
creating build
creating build\lib.win32-2.7
creating build\lib.win32-2.7\lxml
copying src\lxml\builder.py -> build\lib.win32-2.7\lxml
copying src\lxml\cssselect.py -> build\lib.win32-2.7\lxml
copying src\lxml\doctestcompare.py -> build\lib.win32-2.7\lxml
copying src\lxml\ElementInclude.py -> build\lib.win32-2.7\lxml
copying src\lxml\pyclasslookup.py -> build\lib.win32-2.7\lxml
copying src\lxml\sax.py -> build\lib.win32-2.7\lxml
copying src\lxml\usedoctest.py -> build\lib.win32-2.7\lxml
copying src\lxml\_elementpath.py -> build\lib.win32-2.7\lxml
copying src\lxml\__init__.py -> build\lib.win32-2.7\lxml
creating build\lib.win32-2.7\lxml\html
copying src\lxml\html\builder.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\clean.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\defs.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\diff.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\ElementSoup.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\formfill.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\html5parser.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\soupparser.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\usedoctest.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\_dictmixin.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\_diffcommand.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\_html5builder.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\_setmixin.py -> build\lib.win32-2.7\lxml\html
copying src\lxml\html\__init__.py -> build\lib.win32-2.7\lxml\html
creating build\lib.win32-2.7\lxml\isoschematron
copying src\lxml\isoschematron\__init__.py -> build\lib.win32-2.7\lxml\isoschematron
copying src\lxml\etreepublic.pxd -> build\lib.win32-2.7\lxml
copying src\lxml\tree.pxd -> build\lib.win32-2.7\lxml
copying src\lxml\etree_defs.h -> build\lib.win32-2.7\lxml
creating build\lib.win32-2.7\lxml\isoschematron\resources
creating build\lib.win32-2.7\lxml\isoschematron\resources\rng
copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.win32-2.7\lxml\isoschematron\resources\rng
creating build\lib.win32-2.7\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl
creating build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract_expand.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_include.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_message.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_skeleton_for_xslt1.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for_xslt1.xsl -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt -> build\lib.win32-2.7\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
running build_ext
building 'lxml.etree' extension
error: Unable to find vcvarsall.bat

  [1]: http://lxml.de/
  [2]: http://pypi.python.org/pypi/setuptools

Source: (StackOverflow)

Write xml file using lxml library in Python

I'm using lxml to create an XML file from scratch; having a code like this:

from lxml import etree

root = etree.Element("root")
root.set("interesting", "somewhat")
child1 = etree.SubElement(root, "test")

How do I write root Element object to an xml file using write() method of ElementTree class?


Source: (StackOverflow)