« Back to Canada Gazetteer
Access -- Canada Gazetteer
Generate an API token, a string of 40 alpha-numeric characters. It is a PASSWORD that you should keep for your own or share with only those who you trust.
With credits in your account and API token ready, you can start querying the dataset:
Copy and enter the above URL address in your browser address bar and you will be on your way.
401 Unauthorized ?
No surprise. That's because all API requests require an API token. Attach the token, e.g.
XXXX, as a URL parameter like this:
Replace XXXX with your own API token and you should be seeing the index of the dataset, a list of available data tables.
Query a data table
Dataset (database) is comprised of one or more data tables. When you know the available tables in a dataset, you can query any of them, e.g. `city`, by:
Which will select the first rows of this table. If results are in JSON, follow
links.next.href for the next page / batch of rows.
You can specify which rows to select by certain criteria, known as WHERE clause in relational database queries. For example, for all `city` rows with `id` < 25:
Or for city.id between 90 and 150, inclusive:
Find out rows with column `field_1` being empty, `field_2` being "John Doe", and `field_3` being not "female":
To search for a specific phrase against a table field, use the
-LIKE- comparison operator in the WHERE clause via the online query tool, like this:
Which would return all movies with title containing the phrase "the empire", like "End of The Empire War", "The Empire Lost", etc. However, a search phrase starting with
% is very likely very expensive in performance thus costing more Credits. A better approach if possible is to not use the leading
-LIKE- searches, like this:
Which would only search for movie titles starting with the phrase "the empire", like "The Empire Lost", "The Empires in History", etc.
LIMIT rows per page
Sometimes it makes sense to ask for more rows per page, or less. To do this, you need the limit parameter in the query URL:
Which will select and return 200 rows at most per page. Allowed limits: 1, 10, 50, 100, 200, 500, 1000, 2000, 5000, 10000. Default limit: 100.
FIELDS (columns) to select
You can use fields parameter in the URL address to specify which columns / fields to select:
Which will select the columns `title` and `slug` of table `city`.
To select ALL columns / fields, use
If no fields are specified in the request URL, however, only the `id` and `ts` columns will be returned. This is so that you can check if a particular row (by `id`) has been updated (`ts` changed) since your last retrieval and decide to fetch it again (for the updates) or not.
JSON is the default format of query results. To ask for a specific format for returned results, use the app parameter in request URL:
Which will return the results in JSON. For now, we support:
- json (JSON)
- csv (CSV)
Json is both readable by end users and programmable by developers. Csv (Comma-separated Values) can be well handled by spreadsheet programs such as Excel, e.g.
Which will simply display the data results as CSV in your browser. To force download of the results into a file named
yourfile.csv, use this:
More formats are in the to-do list: excel, mysql, mssql, access, pdf, etc.
COUNT total rows
A query returns only a partial set (called page or batch) of ALL rows that match your request. For instance, the total number of rows matching the query may be 1,500, with each returned page giving 200 rows, you will go through 8 sequential pages to acquire all matching rows.
Since counting all potential rows for a given query is resource-intensive, you will have to explicitly ask for it if you need it, by attaching the count parameter in the request URL for JSON results:
Now you should be seeing the total number of all potential rows in
meta.stats.rowsTotal which by default remains NULL.
Note COUNT can be an expensive action that charges you by the time it needs for counting. It would be wise to use this only against small set of rows.
JOIN of tables
It's useful to have multiple tables joined together in a single query so we don't have to query each of the tables in separate queries. However, NOT all tables can be joined together. Only related tables can be joined in one query.
For instance, in a movies database, `movie` table is related to `director` table and vice versa since `movie` table has a field `director_id`, so the JOIN query of both tables looks like this:
To search for all movies by James Cameron, we have:
Further, to find movies directed by James Cameron and that have Leonardo DeCaprio as an actor, we will join yet another 2 tables, `actor` and `movie_x_actor` (with relational fields `movie_id` and `actor_id`):
Order of tables do matter here. In a joined / chained query of multiple tables, each table must have at least one immediately or directly related table to its left. In the example above we have:
Table `actor` is related to `movie_x_actor` at the left. Table `movie_x_actor` is related to table `movie` at the left. Table `director` has a related table `movie` at the left. So there you go, we have a collection of correctly joined tables in proper order.
However, would this work?
No, because there are NO tables related to `actor` at its left. Neither `director` nor `movie` is immediately related to table `actor`. So an error will be given in this regard:
Would this work?
Yes, since every table has a related table at its left.
Switch the positions of `director` and `movie` and we have:
Would it work? No. Because table `director` finds no related table at its left.
Stay tuned for more docs (Coming Soon)
Read tutorials and docs to become a Datactory expert and integrate Datactory datasets to any project without having to worry about data updates and synchronization again.