# Data Import & Export

## Data import
Data to be imported needs to adhere to the Normalized Data Format (NDF). As of today, the conversion from any concrete data source (like MySQL, MongoDB or Firebase) to NDF must be performed manually. In the future, the Prisma CLI will support importing from these data sources directly.
Here is a general overview of the data import process:
```
+--------------+                  +----------------+                       +------------+
|              |                  |                |                       |            |
|  SQL         |  (1) transform   |      NDF       |  (2) chunked upload   |   Prisma   |
|  MongoDB     | +--------------> |                | +-------------------> |            |
|  JSON        |                  |                |                       |            |
|              |                  |                |                       |            |
+--------------+                  +----------------+                       +------------+
```
As mentioned above, step 1 has to be performed manually. Step 2 can then be done by either using the raw import API or the `prisma1 import` command from the CLI.
To view the current state of supported transformations in the CLI and submit a vote for the one you need, you can check out this GitHub issue.
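As an illustration of what step 1 can look like, here is a minimal sketch that converts a plain JSON array of user records into an NDF `nodes` file. The input file `users.json`, its record shape and the `User` type are assumptions made for this example; the transformation for your concrete data source will look different.

```ts
// Minimal sketch of step (1): converting a plain JSON export into NDF "nodes".
// Assumes a hypothetical `users.json` holding an array like
// [{ "id": "johndoe", "firstName": "John", "lastName": "Doe" }, ...]
// and a datamodel containing a matching `User` type.
import * as fs from "fs";

const records: Array<{ id: string; firstName: string; lastName: string }> =
  JSON.parse(fs.readFileSync("users.json", "utf8"));

const ndf = {
  valueType: "nodes",
  values: records.map(r => ({
    _typeName: "User", // the SDL type from the datamodel
    id: r.id,          // must be at most 25 characters long
    firstName: r.firstName,
    lastName: r.lastName,
  })),
};

fs.mkdirSync("data/nodes", { recursive: true });
fs.writeFileSync("data/nodes/1.json", JSON.stringify(ndf));
```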
When uploading files in NDF, you need to provide the import data split across three different kinds of files:
- `nodes`: Data for individual nodes (i.e. database records)
- `lists`: Data for the list fields of nodes
- `relations`: Data for related nodes
You can upload an unlimited number of files for each of these types, but it's recommended that each file is at most 1 MB in size. Otherwise you might run into timeouts.
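To stay under that limit, you can split one large `values` array across several numbered files before importing. The following is a minimal sketch, assuming a single oversized NDF `nodes` document in a hypothetical `all-nodes.json`:

```ts
// Sketch: splitting one large NDF "nodes" document into numbered files
// (1.json, 2.json, ...) that each stay below the recommended 1 MB.
import * as fs from "fs";

const MAX_BYTES = 1_000_000;
const { values } = JSON.parse(fs.readFileSync("all-nodes.json", "utf8"));

fs.mkdirSync("data/nodes", { recursive: true });

let chunk: unknown[] = [];
let fileIndex = 1;

const flush = () => {
  if (chunk.length === 0) return;
  const doc = JSON.stringify({ valueType: "nodes", values: chunk });
  fs.writeFileSync(`data/nodes/${fileIndex++}.json`, doc);
  chunk = [];
};

for (const value of values) {
  chunk.push(value);
  const size = Buffer.byteLength(
    JSON.stringify({ valueType: "nodes", values: chunk }),
    "utf8"
  );
  if (size > MAX_BYTES && chunk.length > 1) {
    chunk.pop(); // the current value tips the chunk over the limit,
    flush();     // so write out the previous values first
    chunk.push(value);
  }
}
flush();
```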
### Important considerations

#### General constraints
When importing data, the `id` field of a node can be at most 25 characters long.
#### Idempotency

Note that import operations are not idempotent: running an import always adds data to your service and never updates existing nodes. Importing the same dataset multiple times therefore leads to undefined behaviour.

For example, importing a node with the same `id` more than once will lead to undefined behaviour and will likely break your service!
#### Data Validation
The import API does not perform any validation checks on the data to be imported. When using the CLI to import data, basic validation checks are executed.
Importing invalid data leads to undefined behaviour and might break your service! As the service maintainer, you are responsible for ensuring the validity of the imported data.
Tip: A good way to ensure valid data when importing is to inspect the data from a previous export of a service with an identical datamodel.
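If you want an extra safety net, a small script can check the generic constraints mentioned above before you import. This sketch only verifies the document structure, the `_typeName` and `id` fields, the 25-character `id` limit and duplicate `id`s within one file; it does not validate the data against your datamodel:

```ts
// Sketch: a basic sanity check over one NDF "nodes" file before importing.
import * as fs from "fs";

const doc = JSON.parse(fs.readFileSync("data/nodes/1.json", "utf8"));

if (doc.valueType !== "nodes" || !Array.isArray(doc.values)) {
  throw new Error('Expected { "valueType": "nodes", "values": [...] }');
}

const seenIds = new Set<string>();
for (const node of doc.values) {
  if (typeof node._typeName !== "string" || typeof node.id !== "string") {
    throw new Error(`Missing _typeName or id: ${JSON.stringify(node)}`);
  }
  if (node.id.length > 25) {
    throw new Error(`id exceeds 25 characters: ${node.id}`);
  }
  if (seenIds.has(node.id)) {
    // Import is not idempotent, so duplicate ids are a red flag.
    throw new Error(`Duplicate id: ${node.id}`);
  }
  seenIds.add(node.id);
}
console.log(`${doc.values.length} nodes passed the basic checks`);
```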
### Data import with the CLI

The Prisma CLI offers the `prisma1 import` command. It accepts one option:
- `--data` (short: `-d`): A file path to a directory containing the data to be imported (this can be either a regular or a zipped directory)
Under the hood, the CLI uses the import API that's described in the next section. However, using the CLI provides some major benefits:
- uploading multiple files at once (rather than having to upload each file individually)
- leveraging the CLI's authentication mechanism (i.e. you don't need to manually send your authentication token)
- ability to pause and resume an ongoing import
- import from various data sources like MySQL, MongoDB or Firebase (not available yet)
#### Input format

When importing data using the CLI, the files containing the data in NDF need to be located in directories named after their type: `nodes`, `lists` and `relations`.
NDF files are JSON files following a specific structure, so each file containing import data needs to end in `.json`. When placed in their respective directories (`nodes`, `lists` or `relations`), the `.json` files need to be numbered incrementally, starting with 1, e.g. `1.json`. The file name can be padded with leading zeros, e.g. `01.json` or `0000001.json`.
#### Example
Consider the following file structure defining a Prisma service:
```
.
├── data
│   ├── lists
│   │   ├── 0001.json
│   │   ├── 0002.json
│   │   └── 0003.json
│   ├── nodes
│   │   ├── 0001.json
│   │   └── 0002.json
│   └── relations
│       └── 0001.json
├── datamodel.prisma
└── prisma.yml
```
The `data` directory contains the files with the data to be imported. All files ending in `.json` adhere to NDF. To import the data from these files, you can simply run the following command in the terminal:

```bash
prisma1 import --data data
```
### Data import using the raw import API

The raw import API is exposed under the `/import` path of your service's HTTP endpoint. For example:

```
http://localhost:4466/my-app/dev/import
https://eu1.prisma.sh/my-app/prod/import
```
One request can upload JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP `Authorization` header of the request!

Here is an example `curl` command for uploading some JSON data (of NDF type `nodes`):
```bash
curl 'http://localhost:4466/my-app/dev/import' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
  -d '{"valueType":"nodes","values":[{"_typeName":"Model0","id":"0","a":"test","b":0,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model1","id":"1","a":"test","b":1},{"_typeName":"Model2","id":"2","a":"test","b":2,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model0","id":"3","a":"test","b":3},{"_typeName":"Model3","id":"4","a":"test","b":4,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model3","id":"5","a":"test","b":5},{"_typeName":"Model3","id":"6","a":"test","b":6},{"_typeName":"Model4","id":"7"},{"_typeName":"Model4","id":"8","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model5","id":"9","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"}]}' \
  -sSv
```
The generic version for `curl` (using placeholders) would look as follows:

```bash
curl '__SERVICE_ENDPOINT__/import' -H 'Content-Type: application/json' -H 'Authorization: Bearer __JWT_AUTH_TOKEN__' -d '{"valueType":"__NDF_TYPE__","values": __DATA__ }' -sSv
```
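To illustrate how the raw API can be scripted, here is a hedged sketch that uploads every NDF file in a local `data/nodes` directory, one request per file. It assumes Node.js 18+ (for the built-in `fetch`) and takes the service endpoint and token from hypothetical `SERVICE_ENDPOINT` and `PRISMA_TOKEN` environment variables:

```ts
// Sketch: uploading all NDF "nodes" files through the raw import API.
import * as fs from "fs";

const endpoint = process.env.SERVICE_ENDPOINT!; // e.g. http://localhost:4466/my-app/dev
const token = process.env.PRISMA_TOKEN!;

async function uploadAll() {
  const files = fs
    .readdirSync("data/nodes")
    .filter(f => f.endsWith(".json"))
    .sort();

  for (const file of files) {
    // Each file already contains a full NDF document:
    // {"valueType": "nodes", "values": [...]}
    const body = fs.readFileSync(`data/nodes/${file}`, "utf8");
    const res = await fetch(`${endpoint}/import`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body,
    });
    if (!res.ok) throw new Error(`${file}: HTTP ${res.status}`);
    console.log(`${file}: uploaded`);
  }
}

uploadAll().catch(err => {
  console.error(err);
  process.exit(1);
});
```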
## Data export

Exporting data can be done either using the CLI or the raw export API. In both cases, the downloaded data is formatted in JSON and adheres to the Normalized Data Format (NDF). As the exported data is in NDF, it can directly be imported into a service with an identical schema. This can be useful when test data is needed for a service, e.g. in a `dev` stage.
### Data export with the CLI

The Prisma CLI offers the `prisma1 export` command. It accepts one option:
- `--export-path` (short: `-e`): A file path to a `.zip` file that will be created by the CLI and where the exported data is stored
Under the hood, the CLI uses the export API that's described in the next section. However, using the CLI provides some major benefits:
- leveraging the CLI's authentication mechanism (i.e. you don't need to manually send your authentication token)
- writing the downloaded data directly to the file system
- cursor management in case multiple requests are needed to export all application data (when doing this manually, you need to send multiple requests and adjust the cursor for each one)
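For example, to export a service's data into a file called `export.zip` (the file name is just an example) in the current directory:

```bash
prisma1 export --export-path export.zip
```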
#### Output format

The data is exported in NDF and will be placed in three directories that are named after the different NDF types: `nodes`, `lists` and `relations`.
### Data export using the raw export API

The raw export API is exposed under the `/export` path of your service's HTTP endpoint. For example:

```
http://localhost:4466/my-app/dev/export
https://database.prisma.sh/my-app/prod/export
```
One request can download JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP `Authorization` header of the request!
The endpoint expects a POST request where the body contains JSON with the following contents:
```json
{
  "fileType": "nodes",
  "cursor": {
    "table": 0,
    "row": 0,
    "field": 0,
    "array": 0
  }
}
```
The values in `cursor` describe the offsets in the database from which data should be exported. Note that each response to an export request returns a new cursor in either of two states:

- Terminated (not full): If all the values for `table`, `row`, `field` and `array` are returned as `-1`, the export has completed.
- Non-terminated (full): If any of the values for `table`, `row`, `field` or `array` is different from `-1`, the maximum size of 10 MB for this response has been reached. If this happens, you can use the returned `cursor` values as the input for your next export request.
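To sketch what this cursor handling can look like in practice, the following loop requests `nodes` data until all cursor values come back as `-1`, storing each raw response to disk. It assumes Node.js 18+ (for the built-in `fetch`) and hypothetical `SERVICE_ENDPOINT` and `PRISMA_TOKEN` environment variables; apart from the `cursor` field, the exact shape of the response body is not relied upon:

```ts
// Sketch: following the export cursor until every offset is -1.
import * as fs from "fs";

const endpoint = process.env.SERVICE_ENDPOINT!; // e.g. http://localhost:4466/my-app/dev
const token = process.env.PRISMA_TOKEN!;

async function exportNodes() {
  let cursor = { table: 0, row: 0, field: 0, array: 0 };
  let fileIndex = 1;

  fs.mkdirSync("export/nodes", { recursive: true });

  while (true) {
    const res = await fetch(`${endpoint}/export`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ fileType: "nodes", cursor }),
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);

    // Store the raw response body verbatim.
    const text = await res.text();
    fs.writeFileSync(`export/nodes/${fileIndex++}.json`, text);

    cursor = JSON.parse(text).cursor; // the response carries the next cursor
    const { table, row, field, array } = cursor;
    if (table === -1 && row === -1 && field === -1 && array === -1) {
      break; // terminated: all data exported
    }
  }
}

exportNodes().catch(err => {
  console.error(err);
  process.exit(1);
});
```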
#### Example

Here is an example `curl` command for exporting some JSON data (of NDF type `nodes`):
```bash
curl 'http://localhost:4466/my-app/dev/export' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
  -d '{"fileType":"nodes","cursor":{"table":0,"row":0,"field":0,"array":0}}' \
  -sSv
```
The generic version for `curl` (using placeholders) would look as follows:

```bash
curl '__SERVICE_ENDPOINT__/export' -H 'Content-Type: application/json' -H 'Authorization: Bearer __JWT_AUTH_TOKEN__' -d '{"fileType":"__NDF_TYPE__","cursor": {"table":__TABLE__,"row":__ROW__,"field":__FIELD__,"array":__ARRAY__}}' -sSv
```
## Normalized Data Format

The Normalized Data Format (NDF) is used as an intermediate data format for import and export in Prisma services. NDF describes a specific structure for JSON.

### NDF value types
When using the NDF, data is split across three different "value types":
- Nodes: Contains data for the scalar fields of nodes
- Lists: Contains data for list fields of nodes
- Relations: Contains data to connect two nodes via a relation by their relation fields
### Structure
The structure for a JSON document in NDF is an object with the following two keys:
- `valueType`: Indicates the value type of the data in the document (this can be either `"nodes"`, `"lists"` or `"relations"`)
- `values`: Contains the actual data (adhering to the value type) as an array
The following examples are based on this datamodel:

```graphql
type User {
  id: String! @unique
  firstName: String!
  lastName: String!
  hobbies: [String!]!
  partner: User
}
```
### Nodes

In case the `valueType` is `"nodes"`, the structure for the objects inside the `values` array is as follows:
```
{
  "valueType": "nodes",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarField1>": ANY, "<scalarField2>": ANY, ..., "<scalarFieldN>": ANY },
    ...
  ]
}
```
The notation expresses that the fields `_typeName` and `id` are of type string. `_typeName` refers to the name of the SDL type from your datamodel. The `<scalarFieldX>` placeholders stand for the names of the scalar fields of that SDL type.
For example, the following JSON document can be used to import the scalar values for two `User` nodes:

```json
{
  "valueType": "nodes",
  "values": [
    {
      "_typeName": "User",
      "id": "johndoe",
      "firstName": "John",
      "lastName": "Doe"
    },
    {
      "_typeName": "User",
      "id": "sarahdoe",
      "firstName": "Sarah",
      "lastName": "Doe"
    }
  ]
}
```
### Lists

In case the `valueType` is `"lists"`, the structure for the objects inside the `values` array is as follows:
```
{
  "valueType": "lists",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarListField>": [ANY] },
    ...
  ]
}
```
The notation expresses that the fields `_typeName` and `id` are of type string. `_typeName` refers to the name of the SDL type from your datamodel. The `<scalarListField>` placeholder is the name of one of the list fields of that SDL type. Note that, in contrast to objects of the `nodes` value type, each object can hold values for only one field.
For example, the following JSON document can be used to import the values for the `hobbies` list field of two `User` nodes:

```json
{
  "valueType": "lists",
  "values": [
    { "_typeName": "User", "id": "johndoe", "hobbies": ["Fishing", "Cooking"] },
    { "_typeName": "User", "id": "sarahdoe", "hobbies": ["Biking", "Coding"] }
  ]
}
```
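For instance, if the datamodel additionally had a hypothetical second list field `favoriteColors: [String!]!`, its values would have to go into a separate object instead of being combined with `hobbies`:

```json
{
  "valueType": "lists",
  "values": [
    { "_typeName": "User", "id": "johndoe", "hobbies": ["Fishing", "Cooking"] },
    { "_typeName": "User", "id": "johndoe", "favoriteColors": ["Blue", "Green"] }
  ]
}
```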
### Relations

In case the `valueType` is `"relations"`, the structure for the objects inside the `values` array is as follows:
```
{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": STRING, "id": STRING, "fieldName": STRING },
      { "_typeName": STRING, "id": STRING, "fieldName": STRING }
    ],
    ...
  ]
}
```
The notation expresses that the fields `_typeName`, `id` and `fieldName` are of type string. `_typeName` refers to the name of an SDL type from your datamodel, and `fieldName` is the name of the relation field of that SDL type. Since the goal of the relation data is to connect two nodes via a relation, each element inside the `values` array is itself a pair (written as an array that always contains exactly two elements) rather than a single object, as was the case for `"nodes"` and `"lists"`.
For example, the following JSON document can be used to create a relation between two `User` nodes via the `partner` relation field:

```json
{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": "User", "id": "johndoe", "fieldName": "partner" },
      { "_typeName": "User", "id": "sarahdoe", "fieldName": "partner" }
    ]
  ]
}
```