How do I automate dataset updates in the Open Data Portal?
Automating your dataset updates is particularly useful if you intend to update your data frequently (e.g. daily or weekly) and/or if your dataset is large (e.g. more than 250,000 rows or 150 MB). If you are interested in setting this up, complete the request form. To complete this form, you will need to:
- Know your dataset ID. The dataset ID is the alphanumeric sequence (four characters, a dash, and four more characters) found at the end of the dataset’s URL, e.g. b3t9-awkp in https://data.iowa.gov/Health/Monthly-Medicaid-Payments-By-Vendor/b3t9-awkp.
- Know how to identify the API field names for the columns in your dataset. You will need these for any columns that serve as a row identifier, the location field, or the source columns for the location field. To find a column's API field name, hover the mouse over the column's information icon, as shown in the image to the right. To quickly see the full list of API field names in the dataset, click “Export” then “SODA API”. The complete list appears under the Field Names heading on the right side (highlighted in the far right image), with the column labels to the left. If the API field names are truncated, which is possible for long names, click the API Docs button and scroll to the Fields section. The API field name is the first item listed in the associated table.
- Decide on the type of update required. There are three choices for updates:
  - Append adds the data in the update file to the data already in the dataset.
  - Replace replaces the data in the dataset with the data in the update file.
  - Upsert appends rows in the update file that are not currently in the dataset and replaces those that are. Row identifiers are required to use the upsert option.
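
To make the checklist above concrete, here is a minimal Python sketch. It is not the portal's actual update tooling. It only illustrates two things: pulling the dataset ID off the end of a dataset URL, and how the three update types would change a small set of rows held in memory. The function names and the `vendor_id` and `amount` field names are hypothetical, chosen only for illustration.

```python
import re

# Dataset ID format described above: four alphanumeric characters,
# a dash, and four more, at the very end of the dataset URL.
DATASET_ID_PATTERN = re.compile(r"/([A-Za-z0-9]{4}-[A-Za-z0-9]{4})/?$")


def extract_dataset_id(url):
    """Return the dataset ID at the end of a dataset URL, or None if absent."""
    match = DATASET_ID_PATTERN.search(url.strip())
    return match.group(1) if match else None


def apply_update(existing, update, mode, row_id=None):
    """Return the rows a dataset would hold after an update (illustration only).

    existing, update: lists of dicts keyed by API field name.
    mode: "append", "replace", or "upsert".
    row_id: API field name of the row identifier (required for "upsert").
    """
    if mode == "append":
        # Every row in the update file is added, even if it duplicates one.
        return existing + update
    if mode == "replace":
        # The update file becomes the entire dataset.
        return list(update)
    if mode == "upsert":
        if row_id is None:
            raise ValueError("upsert requires a row identifier column")
        merged = {row[row_id]: row for row in existing}
        for row in update:
            # A matching identifier replaces the row; a new one is appended.
            merged[row[row_id]] = row
        return list(merged.values())
    raise ValueError(f"unknown update type: {mode}")


# Hypothetical sample rows; field names are assumptions, not real columns.
existing = [{"vendor_id": 1, "amount": 100}, {"vendor_id": 2, "amount": 200}]
update = [{"vendor_id": 2, "amount": 250}, {"vendor_id": 3, "amount": 300}]

dataset_id = extract_dataset_id(
    "https://data.iowa.gov/Health/Monthly-Medicaid-Payments-By-Vendor/b3t9-awkp"
)
appended = apply_update(existing, update, "append")
replaced = apply_update(existing, update, "replace")
upserted = apply_update(existing, update, "upsert", row_id="vendor_id")
```

With these sample rows, append yields four rows (vendor 2 appears twice), replace yields only the two rows from the update file, and upsert yields three rows with vendor 2's amount updated. That difference is why a row identifier matters when the update file overlaps with data already in the dataset.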
Program Area: Open Data, Update Datasets