This repo is based on csvs-to-sqlite and datasette; see the Pipfile for the dependencies.
Geocoding was initially done with the arcgis-rest-js module in the joelondon/csv-geoding Observable notebook, although desktop matching with ArcMap was ultimately more successful.
pipenv is a little easier than venv and works similarly to npm. Run:

```shell
pipenv install
pipenv shell
```

to enter a virtual environment with the main dependencies installed.
Checking CSVs for quirks at csvlint.io is a good way of cleaning data, not least because you can download cleaned versions. data-curator, by the same people, is a desktop app that also helps validate CSV; it can open MS Office spreadsheets and create CSV schemas too.
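For a quick programmatic sanity check before uploading to csvlint.io, Python's standard `csv` module can flag ragged rows. A minimal sketch (the inline data and column names are just illustrative, not the real eu-londoners.csv):

```python
import csv
import io

# Stand-in for eu-londoners.csv; in practice open the real file instead.
raw = "Borough,Organisation,Website\nCamden,Org A,https://example.org\nBrent,Org B\n"

reader = csv.reader(io.StringIO(raw))
header = next(reader)
problems = []
for lineno, row in enumerate(reader, start=2):
    if not row:
        continue  # skip blank lines
    # Flag ragged rows, one of the quirks csvlint.io also reports
    if len(row) != len(header):
        problems.append((lineno, len(row)))

print(problems)  # [(3, 2)] - row 3 has 2 fields instead of 3
```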
## csvs-to-sqlite

csvs-to-sqlite creates SQLite databases from CSV, ideally one that has passed validation at https://csvlint.io. The command below turns a CSV into SQLite with full-text search enabled on the named fields:

```shell
csvs-to-sqlite eu-londoners.csv eu-londoners.db -f Borough -f Organisation -f Address -f Website -f "Services offered " -f "Target Beneficiary (Demographic) - Other"
```
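The `-f` flags build a SQLite full-text search index over those columns. A self-contained sketch of the kind of index this produces and how it is queried (the table name, FTS version, and rows here are assumptions for illustration; the exact names csvs-to-sqlite generates may differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS table mimicking what csvs-to-sqlite builds from -f flags
conn.execute(
    "CREATE VIRTUAL TABLE [eu-londoners_fts] USING fts4(Borough, Organisation)"
)
conn.executemany(
    "INSERT INTO [eu-londoners_fts] VALUES (?, ?)",
    [("Camden", "Advice Centre"), ("Brent", "Community Hub")],
)
# MATCH searches all indexed columns at once, case-insensitively
rows = conn.execute(
    "SELECT Borough FROM [eu-londoners_fts] WHERE [eu-londoners_fts] MATCH ?",
    ("advice",),
).fetchall()
print(rows)  # [('Camden',)]
```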
## datasette

datasette can dockerise and deploy to cloud services such as Zeit Now and Heroku for web access. Note that if a CSV has latitude and longitude fields and the datasette-cluster-map plugin is enabled, datasette plots the records on a Leaflet map. Here is a Zeit Now deployment:

```shell
datasette publish now \
  --spatialite \
  --install=datasette-cluster-map \
  --alias="eu-londoners.now.sh" \
  eu-londoners.db
```
Cloud services are opaque; if you find yours is sleeping (presumably to save resources on the machine(s) your app ends up on) you may need to specify a minimum scale with the following command:

```shell
now scale eu-londoners.now.sh bru 1 auto
```
There is a Makefile to preset some commands, so you can just type `make build` etc. instead of the full commands. Changing the field labels may break the build, though, as several of them are hardcoded into the full-text search step in csvs-to-sqlite. The available targets:

```
build   Pre-build the required assets
clean   Clean any previously built db
deploy  Deploy to now
help    Display this help message
serve   Serve the dataset in your browser
```
There is an index.html in this project which uses datasette as a web API for full-text search; the page also plots records with latitude/longitude on a Leaflet map.
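The page presumably queries a datasette JSON endpoint with the `_search` parameter. A sketch of building such a request URL (the host and table path here are assumptions, not confirmed from index.html):

```python
from urllib.parse import urlencode

# Hypothetical datasette JSON endpoint for the deployed database
base = "https://eu-londoners.now.sh/eu-londoners/eu-londoners.json"
# _search runs full-text search; _shape=array returns a plain JSON array
params = {"_search": "advice centre", "_shape": "array"}
url = f"{base}?{urlencode(params)}"
print(url)
# https://eu-londoners.now.sh/eu-londoners/eu-londoners.json?_search=advice+centre&_shape=array
```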
There is a spatialite.py, adapted from the docs, which adds a geometry column and a spatial index for bounding-box queries. It works pretty fast: one or two milliseconds for an AsGeoJSON query, for example!
```python
import sqlite3

conn = sqlite3.connect('eu-londoners.db')
# Load the spatialite extension:
conn.enable_load_extension(True)
conn.load_extension('/usr/local/lib/mod_spatialite.dylib')
# Initialize spatial metadata for this database:
conn.execute('select InitSpatialMetadata(1)')
# Add a geometry column called point_geom to our table:
conn.execute("SELECT AddGeometryColumn('eu-londoners', 'point_geom', 4326, 'POINT', 2);")
# Now update that geometry column with the lat/lon points:
conn.execute('''
    UPDATE [eu-londoners] SET
    point_geom = GeomFromText('POINT('||"longitude"||' '||"latitude"||')',4326);
''')
# If you don't commit, your changes will not be persisted:
conn.commit()
conn.close()
```
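With mod_spatialite loaded, bounding-box queries go through the spatial index on `point_geom`; the same shape of query can be sketched without the extension using plain latitude/longitude columns. A self-contained example (the rows and the Greater London bounding box are made up for illustration):

```python
import sqlite3

# Stand-in for eu-londoners.db with an in-memory table and fake rows
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [eu-londoners] (Organisation TEXT, latitude REAL, longitude REAL)"
)
conn.executemany(
    "INSERT INTO [eu-londoners] VALUES (?, ?, ?)",
    [("Org A", 51.55, -0.10), ("Org B", 52.50, 13.40)],  # London, Berlin
)
# Roughly Greater London's bounding box; with SpatiaLite you would
# filter point_geom instead and let the spatial index do the work
rows = conn.execute(
    """
    SELECT Organisation FROM [eu-londoners]
    WHERE latitude BETWEEN 51.3 AND 51.7
      AND longitude BETWEEN -0.5 AND 0.3
    """
).fetchall()
print(rows)  # [('Org A',)]
```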