lxml
– Extractors for XML and HTML data.
-
class
data_extractor.lxml.
AttrCSSExtractor
(expr: str, attr: str) Bases:
data_extractor.lxml.CSSExtractor
Uses a CSS selector to extract attribute values from subelements of XML or HTML data.
Before extracting, parse the XML or HTML text into a
data_extractor.lxml.Element
object.
- Parameters
expr (str) – CSS Selector Expression.
attr (str) – Target attribute name.
-
extract
(element) Extract the attribute values of subelements from XML or HTML data.
- Parameters
element (
data_extractor.lxml.Element
) – Target.
- Returns
List of str, extracted result.
- Return type
list
- Raises
ExprError – CSS Selector Expression Error.
-
extract_first
(element: Any, default: Any = sentinel) → Any Extract the first data item or subelement from the result of calling extract.
- Parameters
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default:
data_extractor.utils.sentinel
.
- Returns
Data or subelement.
- Return type
Any
- Raises
ExtractError – Raised when the extractor fails to extract the expected data.
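The role of the sentinel default can be sketched in plain Python. This is not the library's implementation: `_MISSING`, `NotFound`, and `first_or_default` are hypothetical stand-ins for `data_extractor.utils.sentinel`, `ExtractError`, and `extract_first`. The sketch assumes the method raises only when nothing was extracted and no default was supplied.

```python
from typing import Any, Sequence

# Hypothetical stand-in for data_extractor.utils.sentinel: a unique object
# that lets the function tell "no default given" apart from default=None.
_MISSING = object()


class NotFound(LookupError):
    """Hypothetical stand-in for ExtractError."""


def first_or_default(results: Sequence[Any], default: Any = _MISSING) -> Any:
    """Return the first extracted item, or the default if there is none."""
    if results:
        return results[0]
    if default is _MISSING:
        # Nothing was extracted and the caller supplied no fallback.
        raise NotFound("nothing extracted and no default given")
    return default


first = first_or_default(["a", "b"])          # first item of the result list
fallback = first_or_default([], default=None)  # None is a valid default
```

Using a dedicated sentinel rather than `None` as the "no default" marker keeps `None` available as a legitimate fallback value.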
-
class
data_extractor.lxml.
CSSExtractor
(expr: str) Bases:
data_extractor.core.AbstractSimpleExtractor
Uses a CSS selector to extract subelements from XML or HTML data.
Before extracting, parse the XML or HTML text into a
data_extractor.lxml.Element
object.
- Parameters
expr (str) – CSS Selector Expression.
-
extract
(element) Extract subelements from XML or HTML data.
- Parameters
element (
data_extractor.lxml.Element
) – Target.
- Returns
List of
data_extractor.lxml.Element
objects, extracted result.
- Return type
list
-
extract_first
(element: Any, default: Any = sentinel) → Any Extract the first data item or subelement from the result of calling extract.
- Parameters
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default:
data_extractor.utils.sentinel
.
- Returns
Data or subelement.
- Return type
Any
- Raises
ExtractError – Raised when the extractor fails to extract the expected data.
-
data_extractor.lxml.
Element
alias of
lxml.etree._Element
-
class
data_extractor.lxml.
TextCSSExtractor
(expr: str) Bases:
data_extractor.lxml.CSSExtractor
Uses a CSS selector to extract the text of subelements of XML or HTML data.
Before extracting, parse the XML or HTML text into a
data_extractor.lxml.Element
object.
- Parameters
expr (str) – CSS Selector Expression.
-
extract
(element) Extract the text of subelements from XML or HTML data.
- Parameters
element (
data_extractor.lxml.Element
) – Target.
- Returns
List of str, extracted result.
- Return type
list
- Raises
ExprError – CSS Selector Expression Error.
-
extract_first
(element: Any, default: Any = sentinel) → Any Extract the first data item or subelement from the result of calling extract.
- Parameters
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default:
data_extractor.utils.sentinel
.
- Returns
Data or subelement.
- Return type
Any
- Raises
ExtractError – Raised when the extractor fails to extract the expected data.
-
class
data_extractor.lxml.
XPathExtractor
(expr: str) Bases:
data_extractor.core.AbstractSimpleExtractor
Uses XPath to extract data from XML or HTML.
Before extracting, parse the XML or HTML text into a
data_extractor.lxml.Element
object.
- Parameters
expr (str) – XPath Expression.
-
extract
(element) Extract subelements or data from XML or HTML data.
- Parameters
element (
data_extractor.lxml.Element
) – Target.
- Returns
List of
data_extractor.lxml.Element
objects, List of str, or str.
- Return type
list
- Raises
data_extractor.exceptions.ExprError – XPath Expression Error.
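The shape of the result depends on the XPath expression itself. The sketch below calls lxml's own `xpath` method directly rather than `XPathExtractor`, purely to show the three shapes named above. It assumes only that `lxml` is installed, and the markup is illustrative.

```python
from lxml import etree

root = etree.fromstring(
    '<feed><item id="1">first</item><item id="2">second</item></feed>'
)

# A node-set of elements comes back as a list of Element objects.
elements = root.xpath("//item")

# Selecting attribute (or text) nodes comes back as a list of strings.
ids = root.xpath("//item/@id")

# An XPath string function comes back as a single string, not a list.
joined = root.xpath("string(//item)")

print([el.tag for el in elements], ids, joined)
```

Callers that accept arbitrary expressions may therefore want to check whether the result is a list before indexing into it.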
-
extract_first
(element: Any, default: Any = sentinel) → Any Extract the first data item or subelement from the result of calling extract.
- Parameters
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default:
data_extractor.utils.sentinel
.
- Returns
Data or subelement.
- Return type
Any
- Raises
ExtractError – Raised when the extractor fails to extract the expected data.