geo_utilities.py

Module that collects various utility functions for the GeocoderPL project

geo_utilities.calc_pnt_dist(c_paths, x_val, y_val, wrld_pl_trans)

Function that calculates the distance from a point to the closest of the given polygons

Parameters:
  • c_paths (List[Path]) – List containing matplotlib paths of regions

  • x_val (float) – Longitude of a given address point

  • y_val (float) – Latitude of a given address point

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 4326 to EPSG 2180

Return type:

float

Returns:

Distance from a given address point to the closest polygon

geo_utilities.clear_xml_node(curr_node)

Function that clears unnecessary XML nodes from memory

Parameters:

curr_node (Element) – Current XML node

Return type:

None

Returns:

The method does not return any values
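
Clearing nodes matters most when a large XML file is streamed element by element. The following is a minimal sketch of that pattern using the standard library's xml.etree.ElementTree, which may differ from the parser the project uses; count_tags and the sample data are hypothetical and not part of geo_utilities.

```python
import io
import xml.etree.ElementTree as ET


def count_tags(xml_source, tag):
    """Count elements named `tag`, clearing each finished node to free memory."""
    count = 0
    # iterparse yields each element once its closing tag has been read
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's children, text and attributes once it has been handled
        elem.clear()
    return count


sample = io.BytesIO(b"<root><addr>a</addr><addr>b</addr><other/></root>")
print(count_tags(sample, "addr"))
```

Note that clear() empties a node but leaves the emptied node attached to its parent, so very long documents may also need the finished children removed from the parent.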

geo_utilities.convert_coords(all_coords, in_system, out_system)

Function that converts multiple coordinates between given systems

Parameters:
  • all_coords (Union[ndarray, List[List[float]]]) – Numpy array / list containing coordinates that should be transformed from one EPSG coordinates system to the other

  • in_system (str) – Input coordinates system (EPSG string number)

  • out_system (str) – Output coordinates system (EPSG string number)

Return type:

Transformer

Returns:

Transformation object that converts coordinates from one EPSG system (in_system) to the other (out_system)

geo_utilities.create_coords_transform(in_epsg, out_epsg, change_map_strateg=False)

Function that creates object that transforms geographical coordinates

Parameters:
  • in_epsg (int) – Number of input EPSG coordinates system

  • out_epsg (int) – Number of output EPSG coordinates system

  • change_map_strateg (bool) – Flag indicating if map strategy should be changed

Return type:

CoordinateTransformation

Returns:

Coordinates transformation that transforms spatial references from input EPSG system to output EPSG system

geo_utilities.create_logger(name)

Function that creates logging file

Parameters:

name (str) – Name of logger

Return type:

Logger

Returns:

Logger object
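
A rough idea of what creating a file-backed logger involves, using only the standard logging module. The helper name make_file_logger, the log_path argument, the message format and the temporary file location are assumptions made for this sketch; the actual create_logger takes only a name and its file location is not documented here.

```python
import logging
import os
import tempfile


def make_file_logger(name, log_path):
    """Return a logger called `name` that appends records to `log_path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the file handler only once, so repeated calls do not duplicate lines
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Hypothetical usage: write one record to a file in a temporary directory
log_file = os.path.join(tempfile.mkdtemp(), "geocoder.log")
log = make_file_logger("geocoder_demo", log_file)
log.info("Logger is ready")
```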

geo_utilities.csv_to_dict(c_path)

Function that imports a CSV file and creates a dictionary from the first two columns of that file

Parameters:

c_path (str) – Path of the CSV file that should be read to dictionary

Return type:

Dict[str, str]

Returns:

Dictionary read from CSV file
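
The mapping described above can be sketched with the standard csv module. This is an illustration only: the helper name first_two_columns_to_dict, the comma delimiter, the UTF-8 encoding and the absence of a header row are all assumptions, and the sample file is made up.

```python
import csv
import os
import tempfile


def first_two_columns_to_dict(csv_path, delimiter=","):
    """Map column 1 to column 2 for every row that has at least two cells."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        # Columns beyond the second are ignored; shorter rows are skipped
        return {row[0]: row[1] for row in reader if len(row) >= 2}


# Hypothetical sample file written to a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "names.csv")
with open(demo_path, "w", newline="", encoding="utf-8") as handle:
    handle.write("WARSZAWA,Warszawa,extra\nLODZ,Łódź\n")

print(first_two_columns_to_dict(demo_path))
```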

geo_utilities.fill_regs_tables()

Function that fills tables with parameters of regions shapes

Return type:

None

Returns:

The method does not return any values

geo_utilities.gen_fin_bubds_ids(c_coords, c_len, top_geojson, top_ids, bdot10k_dist, bdot10k_ids, crds_inds, pow_bubd_arr, dod_opis_list, addr_phrs_list, addr_phrs_len, c_addr_phrs_uniq, wrld_pl_trans)

Function that finds the closest building shape for a given PRG point

Parameters:
  • c_coords (ndarray) – Numpy array containing all address points in given sector

  • c_len (int) – Number of current address points

  • top_geojson (ndarray) – Numpy array containing top “n” BDOT10k buildings located closest to given address point

  • top_ids (ndarray) – Numpy array containing IDs of top “n” BDOT10k buildings located closest to given address point

  • bdot10k_dist (ndarray) – Numpy array containing distance of a given address point to closest building from BDOT10k database

  • bdot10k_ids (ndarray) – Numpy array containing IDs of buildings from BDOT10k database

  • crds_inds (ndarray) – Numpy array containing BDOT10k buildings indices for given sector

  • pow_bubd_arr (ndarray) – Numpy array containing information about all BDOT10k buildings in current sector

  • dod_opis_list (ndarray) – Numpy array containing additional descriptions of an address point

  • addr_phrs_list (List[str]) – List containing address points phrases

  • addr_phrs_len (int) – Length of address points phrases list

  • c_addr_phrs_uniq (str) – Current unique addresses string

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 4326 to EPSG 2180

Return type:

Tuple[str, str]

Returns:

  • c_adr_phr (str) - current addresses phrase

  • c_addr_phrs_uniq (str) - current unique addresses string

geo_utilities.get_bdot10k_id(curr_coords, coords_inds, bdot10k_ids, bdot10k_dist, dod_opis_list, addr_phrs_list, addr_phrs_len, wrld_pl_trans, addr_phrs_uniq, sekts_arr, sekts_ids, pow_bubd_all, sekt_addr_phrs)

Function that returns id and distance of polygon closest to PRG point

Parameters:
  • curr_coords (ndarray) – Numpy array containing all address points in region

  • coords_inds (ndarray) – Numpy array containing BDOT10k buildings indices

  • bdot10k_ids (ndarray) – Numpy array containing IDs of buildings from BDOT10k database

  • bdot10k_dist (ndarray) – Numpy array containing distance of a given address point to closest building from BDOT10k database

  • dod_opis_list (ndarray) – Numpy array containing additional descriptions of an address point

  • addr_phrs_list (List[str]) – List containing address points phrases

  • addr_phrs_len (int) – Length of address points phrases list

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 4326 to EPSG 2180

  • addr_phrs_uniq (str) – Unique addresses string

  • sekts_arr (ndarray) – Numpy array containing sectors of address points

  • sekts_ids (ndarray) – Numpy array containing indices of sectors

  • pow_bubd_all (ndarray) – Numpy array containing information about all BDOT10k buildings in current region

  • sekt_addr_phrs (ndarray) – Numpy array containing sectors of address points

Return type:

str

Returns:

Unique addresses string

geo_utilities.get_corr_reg_name(curr_name)

Function that corrects wrong regions names

Parameters:

curr_name (str) – Current region name

Return type:

str

Returns:

Corrected region name

geo_utilities.get_osm_coords(address, outside_pts, c_paths, popraw_list, c_ind, coord1, coord2, dists_list, zrodlo_list, wrld_pl_trans)

Function that returns OSM coordinates of address point or distance from the district shapefile

Parameters:
  • address (str) – Address string

  • outside_pts (ndarray) – Numpy array of address points identified as being outside of given region border

  • c_paths (List[Path]) – List containing matplotlib paths of regions

  • popraw_list (List[int]) – List containing flags indicating if a given address point is valid

  • c_ind (int) – Current index of a given address point

  • coord1 (float) – Longitude of a given address point

  • coord2 (float) – Latitude of a given address point

  • dists_list (List[float]) – List containing distance of a given address point to its municipality border

  • zrodlo_list (List[str]) – List containing names of the source of a given address point

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 4326 to EPSG 2180

Return type:

None

Returns:

The method does not return any values

geo_utilities.get_region_shapes()

Function that creates shapes for each region

Return type:

Dict[str, Geometry]

Returns:

Ordered dictionary containing shapes of regions

geo_utilities.get_sector_codes(poly_centr_y, poly_centr_x)

Function that returns sector code for given coordinates

Parameters:
  • poly_centr_y (ndarray) – Numpy array containing latitudes of address points

  • poly_centr_x (ndarray) – Numpy array containing longitudes of address points

Return type:

Tuple[ndarray, ndarray]

Returns:

  • c_sekt_szer (np.ndarray) - rows indices of sectors for given coordinates

  • c_sekt_dl (np.ndarray) - columns indices of sectors for given coordinates

geo_utilities.get_sectors_params()

Function that calculates basic parameters of sectors

Return type:

Tuple[float, float, int, int]

Returns:

  • sekt_szer (float) - width of sectors

  • sekt_dl (float) - height of sectors

  • plnd_min_szer (int) - min latitude of Poland

  • plnd_min_dl (int) - min longitude of Poland
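
How sector parameters and sector codes fit together can be illustrated with a regular latitude/longitude grid: subtract the grid origin, divide by the sector size, and take the floor. The sketch below is an assumption about the general idea, not the module's code. All constants and the name sector_indices are hypothetical, and plain Python lists stand in for the Numpy arrays used by get_sector_codes.

```python
import math

# Hypothetical grid parameters, chosen only for this illustration
MIN_LAT = 49       # southern edge of the grid, in degrees
MIN_LON = 14       # western edge of the grid, in degrees
LAT_STEP = 0.5     # sector size along latitude, in degrees
LON_STEP = 0.5     # sector size along longitude, in degrees


def sector_indices(lats, lons):
    """Return (row_indices, column_indices) of the grid cell holding each point."""
    rows = [math.floor((lat - MIN_LAT) / LAT_STEP) for lat in lats]
    cols = [math.floor((lon - MIN_LON) / LON_STEP) for lon in lons]
    return rows, cols


rows, cols = sector_indices([52.23, 50.06], [21.01, 19.94])
print(rows, cols)  # [6, 2] [14, 11]
```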

geo_utilities.get_super_permut_dict(max_len)

Function that creates indices providing superpermutations for lists of strings with length of maximum 5 elements

Parameters:

max_len (int) – Maximum length of superpermutation

Return type:

Dict[int, List[int]]

Returns:

Dictionary containing superpermutation indices
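
A superpermutation on n symbols is a sequence that contains every permutation of those symbols as a contiguous run. The sketch below builds one with the well-known recursive construction and stores the resulting index lists in a dictionary keyed by n. Whether get_super_permut_dict uses this construction, 0-based indices, or this dictionary layout is an assumption; the function names are hypothetical.

```python
from itertools import permutations


def superpermutation(n):
    """Return index list (0..n-1) containing every permutation as a window."""
    if n == 1:
        return [0]
    prev = superpermutation(n - 1)
    m = n - 1  # the new symbol, also the window length in `prev`
    result, seen = [], set()
    for i in range(len(prev) - m + 1):
        window = prev[i:i + m]
        # Use each permutation of the smaller alphabet once, in order of appearance
        if len(set(window)) == m and tuple(window) not in seen:
            seen.add(tuple(window))
            block = window + [m] + window
            # Append the block, reusing the longest overlap with the current tail
            overlap = 0
            for k in range(min(len(result), len(block)), 0, -1):
                if result[-k:] == block[:k]:
                    overlap = k
                    break
            result.extend(block[overlap:])
    return result


def build_superpermutation_dict(max_len):
    """Map each length 1..max_len to its superpermutation index list."""
    return {n: superpermutation(n) for n in range(1, max_len + 1)}


def contains_all_permutations(seq, n):
    """Check that every permutation of range(n) appears as a window of seq."""
    windows = {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}
    return all(p in windows for p in permutations(range(n)))


words = ["a", "b", "c"]
indices = build_superpermutation_dict(5)[len(words)]
print(" ".join(words[i] for i in indices))  # a b c a b a c b a
```

Applied to the parts of an address, such an index list yields one string in which every ordering of the parts occurs, which is one plausible reason for limiting the length to 5 elements: the sequences grow quickly with n.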

geo_utilities.points_in_shape(c_paths, curr_coords)

Function that checks whether points lie inside the shape of a district

Parameters:
  • c_paths (List[Path]) – List containing matplotlib paths of regions

  • curr_coords (ndarray) – Numpy array containing all address points in region

Return type:

ndarray

Returns:

Numpy array of flags indicating if given address point is inside given region shape
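
The parameter types suggest that the containment test is delegated to matplotlib paths. The underlying idea can be shown without that dependency by ray casting: a point is inside a polygon if a horizontal ray from it crosses the boundary an odd number of times. This is a swapped-in technique for illustration, not the module's implementation. The function names are hypothetical, and treating a point as inside when it falls in any of the polygons is an assumption.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon `vertices`."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_any_polygon(polygons, points):
    """Return one flag per point: True if it lies inside at least one polygon."""
    return [
        any(point_in_polygon(x, y, poly) for poly in polygons)
        for x, y in points
    ]


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(points_in_any_polygon([square], [(2, 2), (5, 2)]))  # [True, False]
```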

geo_utilities.points_inside_polygon(grouped_regions, woj_name, trans_crds, points_arr, popraw_list, dists_list, zrodlo_list, bdot10k_ids, bdot10k_dist, sekt_kod_list, dod_opis_list, addr_phrs_list, addr_phrs_len, teryt_arr, json_arr, wrld_pl_trans, sekt_addr_phrs)

Function that checks if given points are inside polygon of their districts and finds closest building shape for given PRG point

Parameters:
  • grouped_regions (Dict[Hashable, ndarray]) – Regions dictionary grouped by district and municipality name

  • woj_name (str) – Name of the province

  • trans_crds (ndarray) – Numpy array containing transformed coordinates of address points

  • points_arr (ndarray) – Numpy array containing coordinates of address points

  • popraw_list (List[int]) – List containing flags indicating if a given address point is valid

  • dists_list (List[float]) – List containing distance of a given address point to its municipality border

  • zrodlo_list (List[str]) – List containing names of the source of a given address point

  • bdot10k_ids (ndarray) – Numpy array containing IDs of buildings from BDOT10k database

  • bdot10k_dist (ndarray) – Numpy array containing distance of a given address point to closest building from BDOT10k database

  • sekt_kod_list (ndarray) – Numpy array containing sector codes of address points

  • dod_opis_list (ndarray) – Numpy array containing additional descriptions of an address point

  • addr_phrs_list (List[str]) – List containing address points phrases

  • addr_phrs_len (int) – Length of address points phrases list

  • teryt_arr (ndarray) – Numpy array containing TERYT codes of address points

  • json_arr (ndarray) – Numpy array containing GeoJSON shapes

  • wrld_pl_trans (CoordinateTransformation) – Coordinates transformation that transforms spatial references from EPSG 4326 to EPSG 2180

  • sekt_addr_phrs (ndarray) – Numpy array containing sectors of address points

Return type:

None

Returns:

The method does not return any values

geo_utilities.reduce_coordinates_precision(geojson_poly, precision)

Function that reduces the decimal precision of coordinates in a GeoJSON string:
  • 0 decimal places is a precision of about 111 km

  • 1 decimal place is a precision of about 11 km

  • 2 decimal places is a precision of about 1.1 km

  • 3 decimal places is a precision of about 111 m

  • 4 decimal places is a precision of about 11 m

  • 5 decimal places is a precision of about 1.1 m

  • 6 decimal places is a precision of about 11 cm

  • 7 decimal places is a precision of about 1.1 cm

Parameters:
  • geojson_poly (str) – GeoJSON string representing a polygon shape

  • precision (int) – The precision to which the accuracy of the coordinates should be reduced

Return type:

str

Returns:

Final GeoJSON string with reduced precision
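
One way to obtain this effect is to parse the GeoJSON, round every number found under a "coordinates" key, and serialize the result again. The sketch below does that with the standard json module. It is an illustration under that assumption; the real function may work differently (for example directly on the text), and round_geojson_coordinates is a hypothetical name.

```python
import json


def round_geojson_coordinates(geojson_text, precision):
    """Round every number under a "coordinates" key to `precision` decimals."""

    def round_nested(value):
        # Coordinates are numbers nested in lists of arbitrary depth
        if isinstance(value, list):
            return [round_nested(item) for item in value]
        return round(value, precision)

    def walk(node):
        if isinstance(node, dict):
            return {
                key: round_nested(val) if key == "coordinates" else walk(val)
                for key, val in node.items()
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return json.dumps(walk(json.loads(geojson_text)))


point = '{"type": "Point", "coordinates": [21.0122287, 52.2296756]}'
print(round_geojson_coordinates(point, 3))
```

With a precision of 3 the coordinates keep roughly 111 m of accuracy, which matches the list above.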

geo_utilities.time_decorator(func)

Decorator that logs information about time of function execution

Parameters:

func – Function call that should be wrapped

Return type:

Callable

Returns:

Time wrapper function call
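
A timing decorator of this kind usually wraps the call, measures the elapsed time, and writes it to a logger. The following is a minimal sketch using only the standard library. The names log_execution_time and timing_demo, the message format, and the use of time.perf_counter are assumptions, not the module's code.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("timing_demo")


def log_execution_time(func):
    """Log how long each call to `func` takes and return its result unchanged."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed as well
            elapsed = time.perf_counter() - start
            logger.info("%s finished in %.3f s", func.__name__, elapsed)

    return wrapper


@log_execution_time
def add(a, b):
    return a + b


print(add(2, 3))
```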