lxml – Extractors for XML or HTML data extraction.
- class data_extractor.lxml.AttrCSSExtractor(expr: str, attr: str)
Bases: CSSExtractor
Uses a CSS selector to extract attribute values from the subelements of XML or HTML data.
Before extracting, parse the XML or HTML text into a data_extractor.lxml.Element object.
- Parameters:
expr (str) – CSS Selector Expression.
attr (str) – Target attribute name.
- extract(element: _Element) → List[str]
Extract the attribute values of subelements from XML or HTML data.
- Parameters:
element (data_extractor.lxml.Element) – Target.
- Returns:
List of str, extracted result.
- Return type:
list
- Raises:
ExprError – CSS Selector Expression Error.
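The sketch below imitates what this method is described as doing, using only the Python standard library. It is not the library's implementation: attr_values is a hypothetical helper, a bare tag name stands in for the CSS selector, and xml.etree.ElementTree stands in for lxml. Skipping matched elements that lack the attribute is an assumption made for the sketch.

```python
import xml.etree.ElementTree as ET
from typing import List


def attr_values(root: ET.Element, tag: str, attr: str) -> List[str]:
    """Hypothetical stand-in: collect `attr` from every `tag` element under root."""
    # Assumption: elements without the attribute are skipped rather than
    # reported as None.
    return [el.attrib[attr] for el in root.iter(tag) if attr in el.attrib]


root = ET.fromstring(
    '<ul>'
    '<li><a href="/a">A</a></li>'
    '<li><a href="/b">B</a></li>'
    '<li><a>C</a></li>'
    '</ul>'
)

# Two of the three <a> elements carry an href, so two strings come back.
hrefs = attr_values(root, "a", "href")
```

The result is always a list of strings in document order, and it is empty when nothing matches.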
- extract_first(element: Any, default: Any = sentinel) → Any
Extract the first item (data or subelement) from the result of the extract method.
- Parameters:
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default: data_extractor.utils.sentinel.
- Returns:
Data or subelement.
- Return type:
Any
- Raises:
ExtractError – Raised when the extractor extracts wrong data.
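The fallback contract described above can be sketched in isolation. This is a minimal stand-in, not the library's code: _SENTINEL, ExtractError and first_or_default are hypothetical names defined here, and raising when nothing is found and no default is supplied is an assumption suggested by the sentinel default.

```python
from typing import Any, List

_SENTINEL = object()  # hypothetical stand-in for data_extractor.utils.sentinel


class ExtractError(Exception):
    """Hypothetical stand-in for the library's ExtractError."""


def first_or_default(results: List[Any], default: Any = _SENTINEL) -> Any:
    """Return the first extracted item, or fall back to `default`."""
    if results:
        return results[0]
    if default is _SENTINEL:
        # No caller-supplied default: report the failure (assumed behavior).
        raise ExtractError("nothing extracted")
    return default


first = first_or_default(["/a", "/b"])
fallback = first_or_default([], default=None)
```

Using a dedicated sentinel object rather than None as the default lets a caller pass None as a legitimate fallback value.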
- class data_extractor.lxml.CSSExtractor(expr: str)
Bases: AbstractSimpleExtractor
Uses a CSS selector to extract subelements from XML or HTML data.
Before extracting, parse the XML or HTML text into a data_extractor.lxml.Element object.
- Parameters:
expr (str) – CSS Selector Expression.
- extract(element: _Element) → List[_Element]
Extract subelements from XML or HTML data.
- Parameters:
element (data_extractor.lxml.Element) – Target.
- Returns:
List of data_extractor.lxml.Element objects, extracted result.
- Return type:
list
- extract_first(element: Any, default: Any = sentinel) → Any
Extract the first item (data or subelement) from the result of the extract method.
- Parameters:
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default: data_extractor.utils.sentinel.
- Returns:
Data or subelement.
- Return type:
Any
- Raises:
ExtractError – Raised when the extractor extracts wrong data.
- data_extractor.lxml.Element
alias of _Element
- class data_extractor.lxml.TextCSSExtractor(expr: str)
Bases: CSSExtractor
Uses a CSS selector to extract the text of subelements of XML or HTML data.
Before extracting, parse the XML or HTML text into a data_extractor.lxml.Element object.
- Parameters:
expr (str) – CSS Selector Expression.
- extract(element: _Element) → List[str]
Extract the text of subelements from XML or HTML data.
- Parameters:
element (data_extractor.lxml.Element) – Target.
- Returns:
List of str, extracted result.
- Return type:
list
- Raises:
ExprError – CSS Selector Expression Error.
- extract_first(element: Any, default: Any = sentinel) → Any
Extract the first item (data or subelement) from the result of the extract method.
- Parameters:
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default: data_extractor.utils.sentinel.
- Returns:
Data or subelement.
- Return type:
Any
- Raises:
ExtractError – Raised when the extractor extracts wrong data.
- class data_extractor.lxml.XPathExtractor(expr: str)
Bases: AbstractSimpleExtractor
Uses XPath to extract data from XML or HTML.
Before extracting, parse the XML or HTML text into a data_extractor.lxml.Element object.
- Parameters:
expr (str) – XPath Expression.
- extract(element: _Element) → List[_Element] | List[str]
Extract subelements or data from XML or HTML data.
- Parameters:
element (data_extractor.lxml.Element) – Target.
- Returns:
List of data_extractor.lxml.Element objects, List of str, or str.
- Return type:
list
- Raises:
data_extractor.exceptions.ExprError – XPath Expression Error.
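The return type varies because the shape of an XPath result depends on the expression: a node-selecting path yields elements, while a trailing text() or @name step yields strings. The sketch below illustrates that distinction with the standard library only. xpath_like is a hypothetical helper, not part of the library, and it handles just those two trailing steps on top of the limited XPath subset that xml.etree.ElementTree supports. Real XPath text() also selects text nodes that follow child elements, which this stand-in ignores.

```python
import xml.etree.ElementTree as ET
from typing import List, Union


def xpath_like(root: ET.Element, expr: str) -> Union[List[ET.Element], List[str]]:
    """Hypothetical stand-in: supports only a trailing /text() or /@name step."""
    path, _, last = expr.rpartition("/")
    if last == "text()":
        # String results: the leading text of each selected element.
        return [el.text for el in root.findall(path) if el.text is not None]
    if last.startswith("@"):
        # String results: the named attribute of each selected element.
        name = last[1:]
        return [el.attrib[name] for el in root.findall(path) if name in el.attrib]
    # Element results: the selected nodes themselves.
    return root.findall(expr)


root = ET.fromstring('<ul><li id="x">one</li><li id="y">two</li></ul>')

items = xpath_like(root, ".//li")         # list of elements
texts = xpath_like(root, ".//li/text()")  # list of str
ids = xpath_like(root, ".//li/@id")       # list of str
```

Code that accepts arbitrary expressions should therefore be ready for either elements or strings in the returned list.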
- extract_first(element: Any, default: Any = sentinel) → Any
Extract the first item (data or subelement) from the result of the extract method.
- Parameters:
element (Any) – The target data node element.
default (Any, optional) – Default value when not found. Default: data_extractor.utils.sentinel.
- Returns:
Data or subelement.
- Return type:
Any
- Raises:
ExtractError – Raised when the extractor extracts wrong data.