xml_parsers.py

XML Parsers module of the GeocoderPL project

class xml_parsers.BDOT10kDataParser(xml_path, tags_tuple, event_type, dicts_tags, tags_dict)

Bases: XmlParser

BDOT10kDataParser class

__init__(xml_path, tags_tuple, event_type, dicts_tags, tags_dict)

Method that creates an instance of the “BDOT10kDataParser” class

Parameters:
  • xml_path (str) – Path of a given XML file

  • tags_tuple (Tuple[str, ...]) – Tuple containing XML tags

  • event_type (str) – Type of event in XML file

  • dicts_tags (Dict[str, str]) – XML tags of BDOT10k dicts

  • tags_dict (Dict[str, int]) – Tags dicts of BDOT10k buildings

Returns:

The method does not return any values

check_path()

Method that checks if path to file is valid

Return type:

None

Returns:

The method does not return any values

parse_bdot10k_xml(xml_contex, fin_row)

Method that extracts data from a BDOT10k XML file

Parameters:
  • xml_contex (iterparse) – Root of XML data tree

  • fin_row (List[Any]) – List containing information on a single building from the BDOT10k database

Return type:

List[List[Any]]

Returns:

List containing data extracted from the BDOT10k database

parse_xml()

Method that parses XML file and saves data to SQL database

Return type:

None

Returns:

The method does not return any values

class xml_parsers.BDOT10kDictsParser(xml_path, tags_tuple, event_type)

Bases: XmlParser

BDOT10kDictsParser class

__init__(xml_path, tags_tuple, event_type)

Method that creates an instance of the “BDOT10kDictsParser” class

Parameters:
  • xml_path (str) – Path of a given XML file

  • tags_tuple (Tuple[str, ...]) – Tuple containing XML tags

  • event_type (str) – Type of event in XML file

Returns:

The method does not return any values

check_path()

Method that checks if path is valid

Return type:

None

Returns:

The method does not return any values

get_bdot10k_dicts()

Method that returns final BDOT10k dicts

Return type:

Dict[Any, Any]

Returns:

Final BDOT10k dicts

parse_xml()

Method that parses XML file into a dictionary object

Return type:

None

Returns:

The method does not return any values
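As a rough illustration of how tags_tuple and event_type could drive a parse into a dictionary, here is a minimal sketch built on the standard library's xml.etree.ElementTree.iterparse. The sample document, the attribute names (name, code), the "entry" tag and the function parse_dicts are all hypothetical; the real BDOT10k dictionary schema and the parser's internals are not described in this reference.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical sample; the real BDOT10k dictionary files look different.
SAMPLE_XML = b"""<dicts>
  <category name="function">
    <entry code="1110">residential</entry>
    <entry code="1230">commercial</entry>
  </category>
</dicts>"""


def parse_dicts(source, tags_tuple: Tuple[str, ...],
                event_type: str) -> Dict[str, Dict[str, str]]:
    """Collect one code-to-label mapping per element whose tag is in tags_tuple."""
    result: Dict[str, Dict[str, str]] = {}
    for _event, elem in ET.iterparse(source, events=(event_type,)):
        if elem.tag in tags_tuple:
            # With the "end" event the element's children are fully parsed.
            result[elem.get("name")] = {
                entry.get("code"): entry.text for entry in elem.iter("entry")
            }
            elem.clear()  # release memory held by the processed subtree
    return result


dicts = parse_dicts(io.BytesIO(SAMPLE_XML), ("category",), "end")
```

Listening for "end" events means each matched element is complete when it is handled, which is why the sketch can read its children immediately.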

class xml_parsers.PRGDataParser(xml_path, tags_tuple, event_type, perms_dict)

Bases: XmlParser

PRGDataParser class

__init__(xml_path, tags_tuple, event_type, perms_dict)

Method that creates an instance of the “PRGDataParser” class

Parameters:
  • xml_path (str) – Path of a given XML file

  • tags_tuple (Tuple[str, ...]) – Tuple containing XML tags

  • event_type (str) – Type of event in XML file

  • perms_dict (Dict[int, List[int]]) – Dictionary containing superpermutation indices

Returns:

The method does not return any values

check_path()

Method that checks if path to file is valid

Return type:

None

Returns:

The method does not return any values

check_prg_pts_add_db(points_arr, woj_name, teryt_arr, json_arr, wrld_pl_trans, sekt_addr_phrs)

Method that converts the spatial reference of PRG points from EPSG 2180 to EPSG 4326, checks whether a given PRG point lies within the shapefile of its district, and finds the closest building shape for that point

Parameters:
  • points_arr (ndarray) – Numpy array containing all address points in a given province

  • woj_name (str) – Name of the province

  • teryt_arr (ndarray) – Numpy array containing TERYT names and TERYT codes

  • json_arr (ndarray) – Numpy array containing JSON shapefiles

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 2180 to EPSG 4326

  • sekt_addr_phrs (ndarray) – Numpy array containing address phrases

Return type:

None

Returns:

The method does not return any values
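The containment check mentioned above (does a point lie inside its district's shape?) can be illustrated with a plain ray-casting test. This is only a sketch of the general technique in pure Python; the project presumably relies on a GIS library for this step, and the function point_in_polygon and the sample square are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips the state
    return inside


# Hypothetical district outline: a 4 x 4 square.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
inside_result = point_in_polygon((2.0, 2.0), square)
outside_result = point_in_polygon((5.0, 2.0), square)
```

An odd number of crossings means the point is inside. Points lying exactly on an edge are not treated specially in this sketch.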

create_points_list(xml_contex)

Method that creates a list of data points

Parameters:

xml_contex (iterparse) – Root of XML data tree

Return type:

List[List[str]]

Returns:

List containing lists of address points
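One way to build such a list of lists from an iterparse context is sketched below. It assumes the standard library's xml.etree.ElementTree.iterparse and an invented, simplified address schema; the function collect_points, its record_tag parameter and the sample XML are hypothetical and do not reflect the actual PRG file layout.

```python
import io
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical sample; real PRG files use a different, namespaced schema.
SAMPLE_XML = b"""<points>
  <point><city>Warszawa</city><street>Marszalkowska</street><number>1</number></point>
  <point><city>Krakow</city><street>Dluga</street><number>7A</number></point>
</points>"""


def collect_points(xml_context, tags_tuple: Tuple[str, ...],
                   record_tag: str) -> List[List[str]]:
    """Return one list of strings per record, ordered like tags_tuple."""
    points: List[List[str]] = []
    row = dict.fromkeys(tags_tuple, "")
    for _event, elem in xml_context:
        if elem.tag in tags_tuple:
            row[elem.tag] = (elem.text or "").strip()
        elif elem.tag == record_tag:
            # The record is complete: store the row and start a fresh one.
            points.append([row[tag] for tag in tags_tuple])
            row = dict.fromkeys(tags_tuple, "")
            elem.clear()  # keep memory usage flat on large files
    return points


context = ET.iterparse(io.BytesIO(SAMPLE_XML), events=("end",))
points = collect_points(context, ("city", "street", "number"), "point")
```

Tags missing from a record are left as empty strings, so every inner list has the same length.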

parse_xml()

Method that parses XML file and saves data to SQL database

Return type:

None

Returns:

The method does not return any values

class xml_parsers.XmlParser(xml_path, tags_tuple, event_type)

Bases: ABC

Parent XML parsers class

__init__(xml_path, tags_tuple, event_type)

Abstract method that initializes an instance of the “XmlParser” class

Parameters:
  • xml_path (str) – Path of a given XML file

  • tags_tuple (Tuple[str, ...]) – Tuple containing XML tags

  • event_type (str) – Type of event in XML file

Returns:

The method does not return any values

__weakref__

list of weak references to the object (if defined)

abstract check_path()

Abstract method that checks if path is valid

Return type:

None

Returns:

The method does not return any values

property get_xml_path: str

Property that returns the path of a given XML file

Returns:

Path of a given XML file

abstract parse_xml()

Abstract method that parses XML file

Return type:

None

Returns:

The method does not return any values
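The relationship between the parent class and its subclasses can be pictured with the following minimal sketch. SketchXmlParser and SketchFileParser are hypothetical stand-ins, not the project's classes, and raising FileNotFoundError for an invalid path is an assumption; this reference only states that check_path returns nothing.

```python
import os
from abc import ABC, abstractmethod
from typing import Tuple


class SketchXmlParser(ABC):
    """Hypothetical stand-in mirroring the documented XmlParser interface."""

    def __init__(self, xml_path: str, tags_tuple: Tuple[str, ...],
                 event_type: str) -> None:
        self.xml_path = xml_path
        self.tags_tuple = tags_tuple
        self.event_type = event_type

    @property
    def get_xml_path(self) -> str:
        return self.xml_path

    @abstractmethod
    def check_path(self) -> None:
        """Validate the path; subclasses decide how."""

    @abstractmethod
    def parse_xml(self) -> None:
        """Parse the file; subclasses decide what to extract."""


class SketchFileParser(SketchXmlParser):
    """Hypothetical concrete subclass."""

    def check_path(self) -> None:
        # Assumption: an invalid path is reported by raising an exception.
        if not os.path.isfile(self.xml_path):
            raise FileNotFoundError(self.xml_path)

    def parse_xml(self) -> None:
        self.check_path()
        # Real parsing logic would follow here.


parser = SketchFileParser("definitely_missing_file.xml", ("tag",), "end")
```

Because check_path and parse_xml are abstract, the base class cannot be instantiated directly; only subclasses that implement both methods can.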

xml_parsers.read_bdot10k_dicts()

Function that reads BDOT10k dictionaries into RAM

Return type:

Dict[str, Dict[str, ndarray]]

Returns:

Dictionary containing BDOT10k dictionaries