The AVS addresses are imported into EAS via a one time ETL.
Once EAS goes live, this ETL will be turned off.
Here I describe in general the processing that occurs during the ETL.You can see the most of the code at these URLs
The ETL process is controlled by a python job which is here (jira access required)
If you do not have access, the important code is here
Code Block |
---|
job.commands = [
# load the data
FtpCommand(job=job, connectionGroup=self.connectionOrganizer.ftpConnections['DBI2SFGIS'], sourcePath='/dbi-external/tblAVS.txt', targetPath=csvFileIn, direction='GET'),
FmeCommand(job=job, fmw="avs_load/avs_to_eas.fmw", connectionGroups=(self.connectionGroups['MAD_ETL'],), useSdePorts=(False,), files=(csvFileIn,)),
FmeCommand(job=job, fmw="avs_load/parcels_sfgisStg_no_geom_to_eas.fmw", connectionGroups=(self.connectionGroups['SFGIS_STG_SDE'], self.connectionGroups['MAD_ETL'],), useSdePorts=(True,False,)),
SqlCommand(job=job, sqlStatement="select avs.delete_addresses();", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="vacuum analyze;", connectionGroup=self.connectionGroups['MAD_ETL'], useTransaction=False),
SqlCommand(job=job, sqlStatement="select avs.init_parcel_points();", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="select avs.init_streets_nearest();", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="select avs.init_avs_addresses();", connectionGroup=self.connectionGroups['MAD_ETL']),
AvsLoadCommand(job=job, connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="update avs.avs_addresses set exception_text = 'programming error - row not processed' where address_base_id is null and exception_text is null;", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="select public._eas_validate_addresses_after_etl('ALL');", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="delete from public.d_address_disposition where disposition_description = 'provisional';", connectionGroup=self.connectionGroups['MAD_ETL']),
SqlCommand(job=job, sqlStatement="vacuum analyze;", connectionGroup=self.connectionGroups['MAD_ETL'], useTransaction=False),
# report on results
RemoveDirectoryTreeCommand(job=job, targetDir=os.path.join(getDataFilePath(), 'avs_load_report')),
SqlCommand(job=job, sqlStatement="select avs.qa();", connectionGroup=self.connectionGroups['MAD_ETL']),
DumpTablesCommand(job=job, connectionGroup=self.connectionGroups['MAD_ETL'], tableNames=['avs.vw_load_summary', 'avs.vw_load_results', 'avs.qa'], targetDir=reportDir),
ZipCommand(job=job, sourceDir=os.path.join(getDataFilePath(), 'avs_load_report'), targetFile=os.path.join(getDataFilePath(), 'avs_load_report.zip')),
EmailCommand(job=job, subject='AVS Load Report', text='See attached.', attachFiles=[os.path.join(getDataFilePath(), 'avs_load_report.zip')]),
FtpCommand(job=job, connectionGroup=self.connectionOrganizer.ftpConnections['DBI2SFGIS'], sourcePath=os.path.join(reportDir, 'avs.vw_load_results.csv'), targetPath='/dbi-external/avsEasLoadResults.csv', direction='PUT')
]
|
The majority of the other code is here
with these being the key pieces
- http://code.google.com/p/eas/source/browse/trunk/etl/sql/avs_load/f_init_avs_addresses_ddl.sql
- http://code.google.com/p/eas/source/browse/trunk/etl/sql/avs_load/f_load_ddl.sql
- http://code.google.com/p/eas/source/browse/trunk/etl/src/commands.py#1799sql/avs_load/f_process_address_base_ddl.sql
- http://code.google.com/p/eas/source/browse/trunk/etl/sql/avs_load/f_process_address_unit_ddl.sql
- http://code.google.com/p/eas/source/browse/#svn%2Ftrunk%2Fetl%2Fsql%2Favs_load
...
- /trunk/etl/sql/avs_load/f_process_address_parcel_link_ddl.sql
- http://code.google.com/p/eas/source/browse/trunk/etl/src/commands.py#1799
The execution path is not trivial.
In any case here I walk through the process below mostly in english.
I will try to call out the places where I have to generalize.
...