Open Standards Lab

A web tool for users and creators of open standards.

Getting started

Open Standards Lab Projects

A Standards Lab project provides a workspace to develop JSON schemas and to test data against those schemas.

Project Settings

Owner

This is a read-only field which displays whether you are a project owner. Project owners have full privileges to change any settings, schema or data.

Modified

When the project’s settings were last modified.

Name

The current project name. Project names may only contain the characters A-Z, a-z, 0-9, - and _.

To create a copy of the project, change the name and click ‘Save As New Project’. All settings, schema and data will be copied to the new project.

Top-Level key name for the list of the data

In data standards that support spreadsheets as a format, the contents of the spreadsheet need to be nested under a particular key name.

For example, entering “examples” means that any spreadsheet data uploaded to the system will be converted to JSON data with a top-level key of “examples”.

Example resulting data:

{
    "examples": [
        {"object": 1},
        {"object": 0}
    ]
}
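
As an illustration only (not the Standards Lab implementation), a flat spreadsheet export could be nested under the configured key along these lines; the CSV file name is a placeholder and the key comes from the example above:

import csv
import json

TOP_LEVEL_KEY = "examples"  # the key name entered in the project settings

# Read the flat spreadsheet rows (here from a CSV export) and nest them
# under the configured top-level key.
with open("examples.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(json.dumps({TOP_LEVEL_KEY: rows}, indent=2))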

Further documentation can be found in the developing standards documentation.

Schema

To edit an existing JSON schema file, use the “Upload schema file” button and select the file. Once uploaded, the schema file will be available to open in the editor.

To edit a new file, set the file name using the “File open” text entry; this is set to “schema.json” by default.

Use the “Save schema” button to save any changes.

Schema file management

The schema files in the project will appear in a list.

Each file can be opened in the editor by clicking on the file name.

Use the drop-down menu to “Download” or “Delete” a file. If there are multiple schema files, set which file is the root schema using the menu item “Set as Root Schema”; the default is the first schema file uploaded.

Root Schema

The root schema in a project is the top-level schema which either contains the whole schema or references other schemas being used. For example:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "An Example root schema",
  "type": "object",
  "required": [
    "examples"
  ],
  "properties": {
    "examples": {
      "type": "array",
      "minItems": 1,
      "items": {
        "$ref": "https://example.com/schema/example-items-schema.json"
      },
      "uniqueItems": true
    }
  }
}

Schema editor

The schema editor supports multiple views and modes for editing JSON. The default is set to “Tree”.

Changes in the editor are only saved once the “Save Schema” button is pressed.

Data

To edit a new file, set the file name using the “File open” text entry; this is set to “untitled.json” by default.

To edit existing files, use the “Upload data” button. Supported formats: Comma-Separated Values (.csv), JSON (.json), Microsoft Excel (.xlsx) and OpenDocument Spreadsheet (.ods).

Editing is supported for JSON files and CSV files.

Each file can be opened in the editor by clicking on the file name.

Data editor

The data editor supports multiple views and modes for editing JSON. The default is set to “Code”.

Changes in the editor are only saved once the “Save Data” button is pressed.

Data file management

Use the drop down menu to “Download” or “Delete” a file.

Test

To test the data in the project against the schema, click “Start Test”. Once the test is complete, a summary for each data file will be displayed. Full test results are available by clicking “View Result Details”.

Hosting on Dokku

Standards Lab is designed to be hosted on Dokku.

To set up an app, run these commands on the Dokku server:

# Set an app name
export APP_NAME="standards-lab"
# Create the app
dokku apps:create $APP_NAME
# Set up the domain you want to use (you may need to use the dokku domains command here too)
dokku config:set $APP_NAME ALLOWED_HOSTS=xxxxx
# Create a Redis store, and link it to the app
dokku redis:create $APP_NAME
dokku redis:link $APP_NAME $APP_NAME
# Setup file storage
dokku storage:mount $APP_NAME /var/lib/dokku/data/storage/$APP_NAME/projects_dir:/projects_dir
# Configure the ports the webserver uses
dokku proxy:ports-add $APP_NAME http:80:80
# Set up the number of web servers and workers - change if needed; but at least 1 of each
dokku ps:scale $APP_NAME web=1 worker=1
# Set option - this is needed so that the about page can display what version of the software is deployed
dokku git:set $APP_NAME keep-git-dir true

Now deploy the repository to Dokku in the usual way (a git push or dokku git:sync command).

Optionally, add an SSL certificate by one of the usual ways. For example: https://github.com/dokku/dokku-letsencrypt

Push to the existing dokku app

These instructions are for people with deploy access to Open Data Services’ servers.

These instructions are for deploying to the live instance.

Add your ssh key to dokku if you haven’t already

scp the key to the server, and then on the server:

dokku ssh-keys:add name_for_your_key_here path/to/your_key.pub 

Set up your local git repo

On your local machine, inside a clone of this repository:

# Add the dokku remote
git remote add dokku-live dokku@dokku1.dokku.opendataservices.uk0.bigv.io:standards-lab-live

Do a deploy

On your local machine:

git push dokku-live main

Process Management

Standards Lab requires three main processes to be running: the webserver, the queue worker and the in-memory storage server.

Webserver

In the Docker-based deployment, the webserver process is provided by gunicorn.

The webserver process runs the Django web framework and serves static files such as JavaScript and CSS.

Queue worker process

django-rq is used to start, manage and monitor the processes that test and process data.

The worker process uses django-rq to create processes when requested by the user. An example of a process is the CoVE process.

Restarting this process may cause any in-progress processes to be terminated. If this process is not running, the Test functionality will not work.

This process requires a working Redis server.
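
For orientation, enqueueing work with django-rq looks roughly like the sketch below; the function, queue name and arguments are placeholders, not the project’s actual code:

import django_rq

def run_processor(project_name):
    # Placeholder for the real processing work (e.g. the CoVE process).
    pass

# Put a job on the default queue; a running worker process will pick it up.
queue = django_rq.get_queue("default")
job = queue.enqueue(run_processor, "my-project")
print(job.get_status())  # e.g. "queued", "started", "finished"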

In-memory storage server

Redis is used as the in-memory storage server.

Redis holds results from processes, is used by (django) RQ and acts as a cache for the webserver.

Restarting Redis may clear any in-progress processes and results.

Project Files

Projects are stored in a directory that is specified by the ROOT_PROJECTS_DIR setting in the Django settings. The default is /tmp/standards-lab.

The directory specified by ROOT_PROJECTS_DIR must be created before using Standards Lab.

The project directory includes: uploaded data, uploaded schema, and the project’s current settings.

Management of files

There is no automatic management of projects. Deleting a project’s directory will delete the project from Standards Lab.

Depending on your system, it may be useful to use a cron job to periodically clear out projects that are no longer being used. This could be achieved using find and its mtime argument, or with a short script such as the sketch below.
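
A minimal cleanup sketch in Python, assuming the default ROOT_PROJECTS_DIR and a 30-day cut-off (adjust both to your deployment):

import shutil
import time
from pathlib import Path

ROOT_PROJECTS_DIR = Path("/tmp/standards-lab")  # match your ROOT_PROJECTS_DIR setting
MAX_AGE_SECONDS = 30 * 24 * 60 * 60  # remove projects untouched for 30 days

now = time.time()
for project_dir in ROOT_PROJECTS_DIR.iterdir():
    if project_dir.is_dir() and now - project_dir.stat().st_mtime > MAX_AGE_SECONDS:
        # Deleting a project's directory removes the project from Standards Lab.
        shutil.rmtree(project_dir)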

Security Considerations

Standards Lab allows users to upload arbitrary data and schema to the specified file system and run a process on that data.

Considerations

When thinking about the security implications, consider mitigations such as:

  • Adding HTTP authentication to restrict access to trusted users

  • Adding an SSL certificate to the server so that data is transmitted more securely

  • Using Standards Lab as a local application that is only available to the localhost

  • Isolating any deployment by using containers and virtualisation

  • Making sure users understand not to upload any data that may be private

System Resources

The system resources required depend on the number of users, the size of the schema, and the size of the data to be tested.

Typically, a large dataset requires more memory, and more than one concurrent user requires more CPU.

Coding Style

To create consistent and readable code, please follow these guides:

Python

  • Before committing Python code, make sure to run python-black via $ black

  • Style:


variable_names = "things"

class ClassNames(object):
    pass

def function_names():
    return 1

HTML

  • Indent with 2 spaces

  • Style:

<div>
  <p>Hi</p>
</div>

JS

  • Indent with 2 spaces

  • variables and function names camelCase

  • Style:

function abcdAlpha(param){

}

let obj = {
  propertyA: 1,
  propertyB: 2
}

Contributing to Standards Lab

We welcome all contributions to Standards Lab. Authors should hold the copyright to any contributions and agree that the contribution is licensed under the same license as the Standards Lab project.

Before embarking on contributions, we recommend opening an issue to discuss ideas and issues with the community.

Core components

Main

The main application is the Django framework, which loads the three Django applications that make up Standards Lab:

  • ui

  • api

  • processor

The Django framework is configured using:

  • settings/<settings file>

  • urls.py

  • wsgi.py

  • manage.py

UI

The UI application is further separated into two applications depending on responsibility.

The Django UI application is responsible for:

  • URL routing

  • Http Requests

  • Templates

  • Views

  • Common data templates’ context

The VueJS application is responsible for:

  • Interactivity within the web page

  • Rendering and two-way binding of in-page VueJS templates with data from the API

  • Sending and receiving data from the API

API

The API provides endpoints primarily for the VueJS application. All responses are in JSON format. A usage sketch follows the endpoint list below.

/api/

  • project/<project-name>

    • GET: returns project configuration

    • POST: (JSON) updates or creates the project configuration (edit mode only)

  • project/<project-name>/upload

    • POST: (FormData) uploads a file to the project. Requires a FILE property and an uploadType property with the value "schema" or "data" for the different upload types.

  • project/<project-name>/download/<file-name>

    • GET: Returns the project file. The optional attach=true property determines whether the file is sent to the browser as an attachment or as data; it is not present by default.

  • project/<project-name>/process

    • GET: Returns the status(es) and results of any Processor running for the specified project

    • POST: (JSON) required properties: action with the value "start", and processName with the value "<name-of-processor>"
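
A hedged usage sketch of these endpoints using Python’s requests library; the host, project name and file name below are placeholders:

import requests

BASE = "http://localhost:8000/api"  # placeholder host
PROJECT = "my-project"              # placeholder project name

# Fetch the project configuration.
config = requests.get(f"{BASE}/project/{PROJECT}").json()

# Upload a schema file.
with open("schema.json", "rb") as f:
    requests.post(
        f"{BASE}/project/{PROJECT}/upload",
        files={"FILE": f},
        data={"uploadType": "schema"},
    )

# Start a processor, then poll its status and results.
requests.post(
    f"{BASE}/project/{PROJECT}/process",
    json={"action": "start", "processName": "<name-of-processor>"},
)
status = requests.get(f"{BASE}/project/{PROJECT}/process").json()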

Processor

The processor is responsible for starting, defining and communicating with processing jobs. Each processor implements a start function and a monitor function.
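
As a purely illustrative sketch of that shape (the real signatures and module layout live in the processor application and may differ):

def start(project_name):
    """Kick off the processing job for a project, e.g. by enqueueing it with django-rq."""
    ...

def monitor(project_name):
    """Return the current status and any results for the project's processing job."""
    ...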

When you run Standards Lab using docker-compose, the redis queue data is persisted in the _build/redis-data/ directory.

Utils

Utility functions that are common to all of the django applications.

Dependencies

We have a number of external dependencies for both the frontend and backend.

Updating python requirements

The Python requirements are defined in requirements.in and requirements_dev.in; these are used to generate requirements.txt and requirements_dev.txt using pip-compile.

Using pip-compile in docker

(Re)Generate the requirements files:

$ docker run --rm -v $(pwd):/code rhiaro/pip-tools <selected requirements.in file>

Upgrade the requirements:

$ docker run --rm -v $(pwd):/code rhiaro/pip-tools --upgrade <selected requirements.in file>

To install updated dependencies (TODO: use a shared volume to make this faster):

$ docker-compose down
$ docker-compose build
$ docker-compose up

Using pip-compile locally

(Re)Generate the requirements files:

$ pip-compile <selected requirements.in file>

Upgrade the requirements:

$ pip-compile --upgrade <selected requirements.in file>

Javascript dependencies

We currently use:

v-jsoneditor

This is a VueJS wrapper for jsoneditor. To build v-jsoneditor:

# In the top level directory with the `package-lock.json`.
$ npm install

The minified JavaScript is output to ./node_modules/v-jsoneditor/dist/; this is the source of the files used in standards_lab/ui/static/v-jsoneditor/js/.

Testing

Code linting

We use a combination of flake8 and python-black to lint the python code.

Install the git hook

It is recommended to install the pre-commit git-hook to automatically check your code before a commit is made.

In the top level directory run:

$ ln -s ../../pre-commit.sh .git/hooks/pre-commit

Running tests locally

Tests for the API, UI and processor are in their respective directories.

Locally

To run the tests in your local virtual environment:

$ cd standards_lab
$ python manage.py test

With Docker Compose

See the Docker page

Development and the docker environment

We use docker and dokku in production, and docker-compose can be used as a local development environment.

Start everything up with docker-compose up. This will show you logs from all of the running containers in the console. To run it in the background, use docker-compose up -d. Shut it down with docker-compose down.

To see the logs (eg. in another console, or if you’re running it in the background), run docker-compose logs. To see logs for a particular container, run docker-compose logs [container], eg. docker-compose logs redis (with the service names from the docker-compose file). Use docker-compose logs -f to continually show the logs as the service runs.

Updating the Dockerfile

If you make changes to either Dockerfile or docker-compose.yml you’ll need to rebuild it locally to test it:

$ docker-compose -f docker-compose.yml -f docker-compose.override.dev.yml down # (if running)
$ docker-compose -f docker-compose.yml -f docker-compose.override.dev.yml build --no-cache
$ docker-compose -f docker-compose.yml -f docker-compose.override.dev.yml up # (to restart)

Updating the code

You’ll need to rebuild the docker environment if you add, remove, or upgrade the dependencies.

If you edit Python code the changes should be reloaded automatically.

Running tests

To run the tests with docker-compose locally:

$ docker-compose  -f docker-compose.test.yml up

As before, you’ll need to rebuild the docker environment if you add, remove, or upgrade the dependencies:

$ docker-compose  -f docker-compose.test.yml down
$ docker-compose  -f docker-compose.test.yml build --no-cache
$ docker-compose  -f docker-compose.test.yml up