Overview

The most popular way to get started is with Docker. This page focuses on getting a basic dev installation running. For production we recommend Kubernetes (K8s); see Production.

Getting Ready

  1. Install Docker and docker-compose if not already installed.
  2. Set up or select your storage service.
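
A quick way to confirm the first prerequisite is to check that both tools are on your PATH. The snippet below is a minimal sketch and is not part of the Diffgram installer; the function name missing_tools is hypothetical, and it assumes Compose is installed as a standalone docker-compose binary.

```python
import shutil

# Tool names taken from the Getting Ready list. Assumption: Compose is a
# standalone "docker-compose" binary rather than the "docker compose" plugin.
REQUIRED_TOOLS = ("docker", "docker-compose")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("docker and docker-compose were found on PATH.")
```

This only checks that the executables exist; it does not verify that the Docker daemon is running.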

Select your Storage

By default, Diffgram sets up MinIO for development use.
Simply press Enter during the install process to accept this default.

There are three other storage options: AWS, Azure, and GCP. See Storage Providers List. MinIO may also be installed and configured separately.

The installer will test your storage connection. If you are using a custom setup, we recommend setting up your storage before running the installer.

Install

Run python install.py and follow the prompts. It configures your environment file and runs the needed Docker commands.

In summary, you can copy and paste the following and follow the prompts:

git clone https://github.com/diffgram/diffgram.git
cd diffgram
pip install -r requirements.txt
python install.py

View the Install

Go to http://localhost:8085

View the docker UI to see service status.
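
If you prefer to check from a script rather than a browser, the following minimal sketch polls the same address. It is not part of Diffgram; the function name web_ui_is_up is hypothetical, and it treats any HTTP response (including error statuses) as "the server is answering".

```python
import urllib.error
import urllib.request


def web_ui_is_up(url="http://localhost:8085", timeout=5.0):
    """Return True if `url` answers an HTTP request, False if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print("Web UI reachable:", web_ui_is_up())
```

A False result right after starting the containers may simply mean the services are still coming up, so retrying after a short wait is reasonable.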

Screenshot Examples

[Screenshot: install.py example]

[Screenshot: Docker up and running on Windows]

[Screenshot: up and running in a standard terminal]

Dev Install Considerations

The default dev Docker configuration does not store data durably. The database and storage may both be deleted when the image is deleted. It is not a production web server; see Production installation. For a more durable dev installation, you may also configure a separate database and storage setup.

Mental Prep

Your install time will vary depending on several factors. In the best case, where your storage and Docker are already set up and there are no surprises, it is fast. If you need to set up new hardware, new storage, credentials, etc., it will take more time.

Diffgram is a system designed for scale. This means there is a little bit more configuration. It is quite normal for the install and initial setup to be the most difficult part.

A few things that take time:

  • Storage configuration and permissions
  • Setting up hardware resources (e.g. if running on cloud)
  • Admin account and project setup
  • Setting up resources within Diffgram itself (e.g. connections, schema, etc.)
  • Ecosystem tooling, such as the SDK

We are always looking to improve the install process and welcome ideas!

Debugging

See Debugging A Dev Install

Config

Please note that we provide the various config mechanisms (e.g. docker-compose, Helm chart) with the intent that, in most cases, the config works well out of the box.

If you need to change the config, it is quite possible the changes will "break" it. We will do our best to help, but please keep that in mind. Examples include custom ingress rules, custom TLS, or running some parts on docker-compose and others elsewhere. We welcome ideas to build into the default config templates.

When asking for help, please mention anything that is non-standard and, if possible, provide the complete customized config.

Self-Guided Enterprise Tour

Installation and initial setup of any new technology, including Diffgram, is sometimes the most challenging part. Depending on your needs, please expect to run into at least a handful of issues. We can usually turn around fixes to these types of config and expectation issues same-day or next-day, so please reach out on Slack when you encounter them.

Install Video

Note: After the video was filmed, we added the pip install -r requirements.txt step. It is required to validate the connection during the installation process. See above for the complete commands.

Warnings

🚧

The docker compose dev setup is not for Production.

The database may be deleted if the image is deleted, and the setup does not include a production server such as Gunicorn. See Kubernetes setup for Production installation.

Stopped Containers

diffgram-db_migration
diffgram-createbuckets

These containers are expected to stop upon completion.
They set up the latest database state and create the default buckets, so they exit once their work is done.
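
To tell these expected exits apart from real failures, you can inspect the container statuses. The following is a minimal sketch, not a Diffgram tool: the function names are hypothetical, it assumes the two container names above appear as substrings of the names Docker reports, and it assumes a successful one-shot job shows a status starting with "Exited (0)".

```python
import subprocess

# Names from the list above. Assumption: Compose may add a prefix or suffix,
# so names are matched as substrings rather than exactly.
ONE_SHOT = ("diffgram-db_migration", "diffgram-createbuckets")


def unexpected_stops(ps_output, one_shot=ONE_SHOT):
    """Return names of stopped containers that are not finished one-shot jobs.

    `ps_output` holds one "name<TAB>status" pair per line, as produced by
    docker_ps_output() below.
    """
    problems = []
    for line in ps_output.splitlines():
        if "\t" not in line:
            continue
        name, status = line.split("\t", 1)
        running = status.startswith("Up")
        finished_job = (
            any(job in name for job in one_shot)
            and status.startswith("Exited (0)")
        )
        if not running and not finished_job:
            problems.append(name)
    return problems


def docker_ps_output():
    """Run `docker ps -a` and return tab-separated name/status lines."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("Unexpectedly stopped:", unexpected_stops(docker_ps_output()))
```

An empty list means every container is either running or is one of the two one-shot jobs that exited cleanly.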

Known Install Issues

You may need to restart services.

Docker Compose Known Issues

Windows Specific

📘

Requires Elevated Permissions

Use Run as Administrator. This is because of AWS boto3, not Diffgram itself.

Production Install

Production

We suggest starting with the dev install (this page). It's faster to get started and faster to edit your env setup. Once you feel comfortable with Diffgram locally, then you can start planning for production.

📘

Production Data Protection

Please take care during installation to consider hardware resources and configuration, and read the production installation guide above before starting. Misconfigured setups can lead to data loss.

Database

After the installation you will see that a folder called postgres-data was created. This is where all the SQL data from the Postgres database is stored. Do not delete this folder unless you want to completely delete the data inside Diffgram.

Manual Install with Docker Compose

Use our pre-written docker-compose file and paste it in your environment.

🚧

Additional keys may be required

See the settings.py for a complete list of env vars.

Also, make sure you have the .env file with the appropriate values for each of the environment variables. Here is the example file with a description of each one:

GCP_SERVICE_ACCOUNT_FILE_PATH=/path/to/your/gcp/service/account.json or /dev/null if not used
CLOUD_STORAGE_BUCKET=<GCP STORAGE BUCKET NAME| Only use if GCP is your selected storage provider>
ML__CLOUD_STORAGE_BUCKET=<GCP STORAGE BUCKET NAME FOR ML TASKS| Only use if GCP is your selected storage provider>
SAME_HOST=False <If running in containerized environment leave False>
DIFFGRAM_STATIC_STORAGE_PROVIDER=< one of 'gcp', 'aws', 'azure'>
USER_PASSWORDS_SECRET=<any secure secret for hashing >
SECRET_KEY=<any secure secret for hashing >
DIFFGRAM_AWS_ACCESS_KEY_ID=<aws access key ID| only relevant if using aws>
DIFFGRAM_AWS_ACCESS_KEY_SECRET=<aws access key secret| only relevant if using aws>
DIFFGRAM_S3_BUCKET_NAME=<aws s3 bucket name| only relevant if using aws>
ML__DIFFGRAM_S3_BUCKET_NAME=<aws s3 ml bucket name| only relevant if using aws>
DIFFGRAM_AZURE_CONNECTION_STRING=<azure connection string| only relevant if using azure>
DIFFGRAM_AZURE_CONTAINER_NAME=<azure container name| only relevant if using azure>
ML__DIFFGRAM_AZURE_CONTAINER_NAME=<azure container name| only relevant if using azure>
EMAIL_VALIDATION=False
WALRUS_SERVICE_URL_BASE=http://walrus:8080/
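
Before starting the containers, it can help to sanity-check the .env file. The following is a minimal sketch and not part of Diffgram: the function names are hypothetical, and the grouping of keys per provider is an assumption based only on the descriptions in the example above. settings.py remains the authoritative list.

```python
# Keys per provider, copied from the example file above. Treating each one as
# required is an assumption of this sketch, not a documented rule.
PROVIDER_KEYS = {
    "gcp": (
        "GCP_SERVICE_ACCOUNT_FILE_PATH",
        "CLOUD_STORAGE_BUCKET",
        "ML__CLOUD_STORAGE_BUCKET",
    ),
    "aws": (
        "DIFFGRAM_AWS_ACCESS_KEY_ID",
        "DIFFGRAM_AWS_ACCESS_KEY_SECRET",
        "DIFFGRAM_S3_BUCKET_NAME",
        "ML__DIFFGRAM_S3_BUCKET_NAME",
    ),
    "azure": (
        "DIFFGRAM_AZURE_CONNECTION_STRING",
        "DIFFGRAM_AZURE_CONTAINER_NAME",
        "ML__DIFFGRAM_AZURE_CONTAINER_NAME",
    ),
}
COMMON_KEYS = (
    "DIFFGRAM_STATIC_STORAGE_PROVIDER",
    "USER_PASSWORDS_SECRET",
    "SECRET_KEY",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def missing_keys(env):
    """Return keys that are absent or empty for the selected provider."""
    provider = env.get("DIFFGRAM_STATIC_STORAGE_PROVIDER", "").strip("'\"").lower()
    wanted = COMMON_KEYS + PROVIDER_KEYS.get(provider, ())
    return [key for key in wanted if not env.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        print("Missing or empty keys:", missing_keys(parse_env(handle.read())))
```

The check only looks for empty or absent values; it does not test whether credentials or bucket names are valid.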

Once you have all the environment files set up correctly, run:

docker-compose up
# Or for headless mode
docker-compose up -d

You should see all the containers spin up and be able to access the Web UI at http://localhost:8085.

Backend storage vs Connections

The "backend" storage that is set at installation time is different from "connections". The backend storage is where the system stores and reads BLOB data. It is assumed to be singular and static (excluding the ML bucket), whereas connections are assumed to be many and easily editable.

If you already have data in a designated bucket, use that bucket as a connection and create a new bucket as the backend.

Contributor Install

See New Engineer Welcome

Bare metal installation

You can also spin up each of the services and use your own dispatcher.

Need help?

Join our Community.