- Created by Mandana Moshrefzadeh on 16 February 2021
CKAN is a complex piece of software that works with many other web services. Hence, creating the Docker version was not straightforward and needed much troubleshooting. The useful links are collected here for further study:
- CKAN help:
- Docker:
- command lines: https://docs.docker.com/engine/reference/commandline/docker/
- docker network: https://www.freecodecamp.org/news/how-to-get-a-docker-container-ip-address-explained-with-examples/
- https://docs.docker.com/compose/networking/
- https://stackoverflow.com/questions/58073936/how-to-get-ip-address-of-docker-desktop-vm
- ckan docker - official:
- SDDI Docker:
- Other Ckan docker:
- Stable: https://github.com/keitaroinc/docker-ckan
- https://github.com/ckan/ckan-docker
- https://github.com/okfn/docker-ckan
- https://github.com/kowh-ai/ckan-docker-travis
- https://github.com/eccenca/ckan-docker
- https://github.com/UKHomeOffice/docker-ckan
- Instruction: https://herrmann.tech/en/blog/2020/09/30/how-to-install-and-configure-ckan-2-9-0-using-docker.html
- Docker - DB:
Connect From Your Local Machine to a PostgreSQL Database in Docker
- https://stackoverflow.com/questions/37694987/connecting-to-postgresql-in-a-docker-container-from-outside
How to Restore Database Dumps for Postgres in Docker Container: https://simkimsia.com/how-to-restore-database-dumps-for-postgres-in-docker-container/
How to List Databases and Tables in PostgreSQL Using psql: https://chartio.com/resources/tutorials/how-to-list-databases-and-tables-in-postgresql-using-psql/
- Database Dump inside the docker:
- Some useful discussions:
- FLASK & Front-end web server:
- Installing Docker and running ckan on linux:
- https://serverfault.com/questions/715905/why-am-i-getting-an-invalid-command-proxypass-error-when-i-start-my-apache-2/715906
- https://www.digitalocean.com/community/tutorials/how-to-use-apache-http-server-as-reverse-proxy-using-mod_proxy-extension
- https://www.linode.com/community/questions/311/how-do-i-enabledisable-a-website-hosted-with-apache
- https://askubuntu.com/questions/629995/apache-not-able-to-restart
CKAN Services:
- Datastore: https://docs.ckan.org/en/2.9/maintaining/datastore.html
- Datapusher: https://github.com/ckan/datapusher#datapusher
- CKAN database management: https://docs.ckan.org/en/2.8/maintaining/database-management.html#db-upgrade
THREAD
Mandana Moshref
@MandanaMoshref
Dec 01 11:33
Hello,
I have a question regarding the Docker installation. It seems the Docker setup provided by CKAN itself is only for dev purposes (neither nginx nor apache is included in the Dockerfile or docker-compose.yml).
I need a Docker version of our own CKAN implementation, which uses the CKAN core Docker setup with some revisions to fit our requirements.
I made the changes and it works on my personal machine. Now I need to move it to our server for production-ready use. My question is: does it make sense if I run it on localhost on my Docker server and then use an Apache reverse proxy to map localhost:5000 to https://my-website.com?
Brett
@kowh-ai
Dec 01 13:26
Yes that makes sense - you could include a NGINX docker image/container configuration in your docker-compose.yml file making sure your NGINX configuration contains a proxy_pass line to proxy requests to the CKAN Docker container eg: proxy_pass http://ckan:5000/;
mabah-mst
@mabah-mst
Dec 01 13:33
I have had the same concerns with the docker installation process that is described in the documentation. I am working on deploying a setup using the docker-compose setup from https://github.com/keitaroinc/docker-ckan .
Mandana Moshref
@MandanaMoshref
Dec 01 13:59
@kowh-ai thanks for the reply.
Is including NGINX a recommendation or a requirement?
If I go ahead without including the NGINX Docker container, what will happen?
Two more points for clarification: 1) my ckan version is 2.8.0 2) Apache server is not a docker installation.
Brett
@kowh-ai
Dec 01 14:34
@MandanaMoshref I'd always recommend to have some sort of HTTP server on the front (especially for Prod). It would be a simple NGINX docker container configuration. Running a front-end web/proxy server outside of Docker may cause you some grief (especially with networking) unless you are comfortable working with infrastructure...
@mabah-mst yes the keitaroinc setup is certainly more complete than the current CKAN docker one.
Mandana Moshref
@MandanaMoshref
Dec 03 20:44
@kowh-ai Thanks a lot Brett for your advice. I have one more question.
I did it as you suggested including the NGINX docker and then I used apache reverse proxy and the SSL certificate to secure it. I am just wondering whether it is better to include the security certificate in the NGINX container or is it fine also to include it by the apache server?
Thanks again.
Brett
@kowh-ai
Dec 04 10:14
Oh so your setup is along these lines: user --SSL--> Apache Web Server --non-SSL--> NGINX (Container) --non-SSL--> CKAN (Container)?
Mandana Moshref
@MandanaMoshref
Dec 04 10:16
yes... but only because it worked this way. Honestly, I have no idea how good or bad my approach is.
Brett
@kowh-ai
Dec 04 15:01
Assuming the Apache web server and containers are all running on the same machine (you mentioned moving to your production-ready server previously) I would try and simplify by just taking out the Apache web server from the configuration as it’s not really needed. You could update the NGINX container to listen via an SSL port. The non-Docker CKAN “Deploying a source install” instructions (https://docs.ckan.org/en/latest/maintaining/installing/deployment.html) include the NGINX web server as just a reverse proxy to a WSGI Server (which is the Apache Server replacement with CKAN 2.9). With Docker containers, (I think) you don’t need to worry about the WSGI server as the running CKAN container exposes port 5000. I hope that isn’t confusing and hasn’t given you a lot more work to do…
CKAN itself has several Docker installation repositories (also referenced in "Useful links" above).
Our installation is mainly based on the official one from CKAN, with improvements inspired by the Keitaro CKAN Docker setup (keitaroinc/docker-ckan).
Here are the details regarding the design, process, installation and operation of the HEF AgriHUB Docker setup on the HEF server hosted by LRZ:
For this setup we use Docker Compose.
All the relevant files for creating the HEF AgriHUB are shown in the structure below:
Main folder
docker folder
docker-compose.yml: running "docker-compose up -d --build" inside this folder reads this file. It defines 6 services (in our implementation nginx is deactivated because of a conflict on port 80; we use the Apache server on the host machine instead). After this step, CKAN should be running and there should be five containers running.
The Postgres container may need longer to initialize the database cluster than the ckan container will wait for. This time span depends heavily on available system resources. If the CKAN logs show problems connecting to the database, restart the ckan container a few times:
    docker-compose restart ckan
    docker ps | grep ckan
    docker-compose logs -f ckan
There should be four named Docker volumes.
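The start-up race described above can also be handled by polling a TCP port instead of restarting by hand. The following is only a minimal Python sketch under stated assumptions: the helper name `wait_for_port` is hypothetical and not part of the HEF AgriHUB scripts, and it only checks that something accepts connections on the port, not that the database cluster or CKAN is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds
    have passed. `interval` is both the per-attempt connect timeout and the
    pause between attempts.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused)
            # while the service is not yet listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, after `docker-compose restart ckan`, calling `wait_for_port("localhost", 5000, timeout=120)` on the host would report whether the CKAN port became reachable within two minutes. Port 5000 on localhost is taken from the service table on this page; the timeout value is an arbitrary choice.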
Docker structure
For each service, the build folder is provided and can be studied.
CKAN container
CKAN extension
Here is a list of all required extensions. A detailed explanation of these extensions is provided on the page "Developer Guideline: HEF CKAN (The Comprehensive Knowledge Archive Network)".
    RUN . $CKAN_VENV/bin/activate && $CKAN_VENV/bin/pip install ckantoolkit
    RUN . $CKAN_VENV/bin/activate && $CKAN_VENV/bin/pip install -e git+https://github.com/ckan/ckanext-dcat.git#egg=ckanext-dcat
    RUN . $CKAN_VENV/bin/activate && $CKAN_VENV/bin/pip install -e git+https://github.com/datopian/ckanext-gdpr.git#egg=ckanext-gdpr
    RUN . $CKAN_VENV/bin/activate && $CKAN_VENV/bin/pip install /usr/lib/ckan/default/src/ckanext-spatial
    RUN . $CKAN_VENV/bin/activate && $CKAN_VENV/bin/pip install /usr/lib/ckan/default/src/ckanext-harvest
data & backup files
HEF AgriHUB docker
Running /SetupCKANDocker performs the following steps:
- Go to the folder contrib/docker, read docker-compose.yml and create the services (described in the previous part)
- Copy the whole HEF folder into the ckan container
- In the db container:
  - add the postgis extension
  - add the spatial reference systems
  - change the owner of the view geometry_columns to the ckan user
  - change the owner of the table spatial_ref_sys to the ckan user
- Copy agrihub.dump into the ckan container
- Use the CKAN CLI to clean the CKAN database
- Restore the agrihub.dump file into the freshly installed CKAN (HEF AgriHUB) database
- Remove the agrihub.dump file
- Copy datastore.dump into the ckan container
- Restore the datastore.dump file into the datastore database
- Remove the datastore.dump file
- Rebuild the Solr index using the CKAN CLI
- Change the owner of the folder storing the uploaded files to ckan
- Set the required permissions and grants for the datastore database
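The db container steps in the list above can be sketched as a small command builder. This is an illustration only, not the content of the SetupCKANDocker script: the function names are hypothetical, and the container name `db`, the database name `ckan` and the role `ckan` are assumptions. It also assumes PostGIS is installed with `CREATE EXTENSION`, which creates and fills `spatial_ref_sys`. In PostGIS 2.x and later, `geometry_columns` is a view and `spatial_ref_sys` is a table, so the second statement uses `ALTER TABLE`.

```python
from typing import List


def postgis_statements(owner: str = "ckan") -> List[str]:
    """SQL for the PostGIS preparation steps (owner role is an assumption)."""
    return [
        "CREATE EXTENSION IF NOT EXISTS postgis;",
        f"ALTER VIEW geometry_columns OWNER TO {owner};",
        f"ALTER TABLE spatial_ref_sys OWNER TO {owner};",
    ]


def psql_argv(container: str, db_user: str, database: str, sql: str) -> List[str]:
    """Build a `docker exec ... psql` argument list that runs one SQL statement.

    ON_ERROR_STOP makes psql exit with a non-zero status if the statement fails.
    """
    return [
        "docker", "exec", "-i", container,
        "psql", "-U", db_user, "-d", database,
        "-v", "ON_ERROR_STOP=1",
        "-c", sql,
    ]


def build_plan(container: str = "db", db_user: str = "ckan", database: str = "ckan") -> List[List[str]]:
    """Return the commands in execution order without running them."""
    return [psql_argv(container, db_user, database, sql) for sql in postgis_statements(db_user)]
```

Each entry returned by `build_plan()` can be passed to `subprocess.run(argv, check=True)`. Keeping command construction separate from execution allows the plan to be printed and reviewed before anything touches the database.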
CKAN CLI
To learn more about the CKAN CLI and how to work with it, refer to this documentation: https://docs.ckan.org/en/2.9/maintaining/cli.html
Apache server as a reverse proxy
At this stage, the HEF AgriHUB is running and can be accessed via the IP address and port 5000 (ip:5000).
However, we would like to run it securely under the DNS name "https://agrihub.hef.tum.de" and limit access to this service to the MWN (Münchner Wissenschaftsnetz) only. For that we installed the Apache server on the HEF-LRZ server (the host machine) with the following config file:
<VirtualHost *:80>
    ServerName agrihub.hef.tum.de
    Redirect / https://agrihub.hef.tum.de/
</VirtualHost>
<VirtualHost *:443>
    ServerName agrihub.hef.tum.de
    ServerAdmin admin@hef.tum.de
    SSLEngine on
    SSLProtocol -all +TLSv1.2
    SSLCertificateFile /etc/apache2/ssl_agrihub/ckanhef.pem
    SSLCertificateKeyFile /etc/apache2/ssl_agrihub/ckanhef.key
    SSLCertificateChainFile /etc/apache2/ssl_agrihub/cacerts_DNS.crt
    SSLCipherSuite ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128:AES256:AES:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK
    SSLHonorCipherOrder on
    SSLCompression off
    <Proxy *>
        Order Deny,Allow
        Deny from all
        Allow from 129.187 10.157 10.152 10.162.246 2001:4ca0:2fff::/48 2001:4ca0:2fff:9:0:1::/96 2001:4ca0:2fff:9:0:2::/96
    </Proxy>
    ProxyPass / http://localhost:5000/
    ProxyPassReverse / http://localhost:5000/
    #ProxyPreserveHost On
    #ProxyRequests Off
</VirtualHost>
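The "Allow from" entries in the config above mix partial IPv4 addresses, which Apache treats as subnet prefixes (for example, 129.187 matches every address starting with 129.187.), with IPv6 networks in CIDR notation. To check in advance whether a given client address would be admitted, the list can be modelled with Python's `ipaddress` module. This is a rough sketch with hypothetical helper names. It only models the address matching, not Apache's full Order/Deny/Allow evaluation.

```python
import ipaddress

# Entries copied from the "Allow from" line of the Apache config.
ALLOW_FROM = [
    "129.187", "10.157", "10.152", "10.162.246",
    "2001:4ca0:2fff::/48",
    "2001:4ca0:2fff:9:0:1::/96",
    "2001:4ca0:2fff:9:0:2::/96",
]


def to_network(entry: str):
    """Convert one 'Allow from' entry into an ip_network object.

    A partial IPv4 address with n octets becomes a network with prefix
    length 8*n, e.g. "10.162.246" -> 10.162.246.0/24.
    """
    if "/" in entry or ":" in entry:
        return ipaddress.ip_network(entry, strict=False)
    octets = entry.split(".")
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")


def is_allowed(client_ip: str, entries=ALLOW_FROM) -> bool:
    """True if client_ip falls into at least one allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr.version == net.version and addr in net
        for net in map(to_network, entries)
    )
```

With this list, `is_allowed("10.162.246.17")` is true while `is_allowed("10.162.247.1")` is false, because only the 10.162.246 prefix is listed.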
Currently, HEF AgriHUB is running with the following information:
Service | port | accessibility |
---|---|---|
Apache: /etc/apache2/sites-available/agrihub.hef.tum.de.conf | 80 & 443 (reverse proxy to 5000) | Allow from 10.162.246 129.187 |
CKAN | 5000 | (limited only to local machine) |
datapusher | 8000 | (limited only to local machine) |
db | 5432 | (inside the docker network) |
solr | 8983 | (inside the docker network) |
redis | 6379 | (inside the docker network) |
The docker-compose.yml file is configured so that the services start automatically after every system reboot or Docker restart. This is done with the setting:
restart: always
However, it is suggested to check HEF AgriHUB every week to make sure it is running as expected.
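The weekly check could be partly automated by querying CKAN's `status_show` API action, which returns a JSON document with a `success` field. The following is a minimal sketch, assuming the instance is reachable at the base URL passed in. The function names are hypothetical, and a successful response only shows that the CKAN web application answers, not that datapusher, Solr or the datastore are working.

```python
import json
import urllib.error
import urllib.request


def status_ok(payload: bytes) -> bool:
    """True if the body is a CKAN API response with "success": true."""
    try:
        data = json.loads(payload)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page from the proxy.
        return False
    return isinstance(data, dict) and data.get("success") is True


def check_ckan(base_url: str, timeout: float = 10.0) -> bool:
    """Query <base_url>/api/3/action/status_show and report whether CKAN answers."""
    url = base_url.rstrip("/") + "/api/3/action/status_show"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and status_ok(resp.read())
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or HTTP error status.
        return False
```

On the host machine, `check_ckan("http://localhost:5000")` tests the container directly, bypassing the Apache proxy. Calling it with the public HTTPS address from a machine inside the allowed network tests the whole chain.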