Table of Contents#
- Prerequisites
- Method 1: Import via AWS Management Console (Using S3)
- 2.1 Prepare Your JSON File
- 2.2 Upload JSON to Amazon S3
- 2.3 Create a DynamoDB Import Job
- 2.4 Monitor the Import Job
- Method 2: Import via AWS SDK (Programmatic Approach)
- 3.1 AWS SDK Setup
- 3.2 Python Example (Boto3)
- 3.3 Node.js Example (AWS SDK for JavaScript)
- Console vs. SDK: When to Use Which?
- Troubleshooting Common Issues
- Conclusion
- References
Prerequisites#
Before starting, ensure you have the following:
- AWS Account: With administrative access or permissions for DynamoDB, S3, and IAM (for console method).
- DynamoDB Table: For the SDK method, an existing table with the correct schema (partition key and sort key, if applicable); writes fail if an item is missing a key attribute. The console's Import from S3 feature creates a new table as part of the import.
- JSON File: Data in DynamoDB-compatible JSON format (see Section 2.1 for details).
- S3 Bucket (for console method): A bucket to store the JSON file (must be in the same AWS Region as your DynamoDB table).
- AWS SDK Setup (for SDK method): Install the AWS SDK for your preferred language (e.g., `boto3` for Python, `aws-sdk` for Node.js) and configure credentials (via `aws configure` or IAM roles).
Method 1: Import via AWS Management Console (Using S3)#
The DynamoDB console supports importing data directly from Amazon S3, making it ideal for large datasets (up to terabytes) with minimal coding. Here’s how:
2.1 Prepare Your JSON File#
DynamoDB requires JSON data in JSON Lines format (also called “newline-delimited JSON”), where each line is a standalone JSON object. For Import from S3, each line wraps a single item in an `Item` key, with every attribute in DynamoDB’s typed format. Do not use a single JSON array (e.g., `[{...}, {...}]`).
Example: Correct JSON Lines Format#
```json
{"Item": {"id": {"S": "1"}, "name": {"S": "Alice"}, "age": {"N": "30"}}}
{"Item": {"id": {"S": "2"}, "name": {"S": "Bob"}, "age": {"N": "25"}}}
```
Example: Incorrect Format (Avoid This!)#
```json
[
  {"id": "1", "name": "Alice", "age": 30},
  {"id": "2", "name": "Bob", "age": 25}
]
```
- Why? DynamoDB imports parse one item per line. Arrays or multi-line JSON objects will cause errors.
- Data Types: Each attribute must specify its DynamoDB data type (e.g., `{"S": "string"}`, `{"N": "number"}`, `{"BOOL": true}`); note that numbers are encoded as strings. See DynamoDB Data Types for details.
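If your source data is plain JSON rather than DynamoDB-typed JSON, a small helper can add the typed-attribute wrappers before upload. A minimal sketch that handles only the common scalar types (the helper names are illustrative, and you would wrap each line in an `Item` key if your target format requires it):

```python
import json

def to_dynamodb_attr(value):
    # Map plain Python values to DynamoDB's typed-attribute format.
    # Only the common scalar types are handled in this sketch.
    if isinstance(value, bool):          # check bool before int (bool is an int subclass)
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # DynamoDB encodes numbers as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"Unsupported type: {type(value).__name__}")

def to_json_lines(records):
    # Emit one typed item per line -- the JSON Lines layout DynamoDB expects
    lines = []
    for record in records:
        item = {key: to_dynamodb_attr(val) for key, val in record.items()}
        lines.append(json.dumps(item))
    return "\n".join(lines)
```

For example, `to_json_lines([{"id": "1", "age": 30}])` yields a single line with `"id"` as a string attribute and `"age"` as a number attribute.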
2.2 Upload JSON to Amazon S3#
- Go to the S3 Console.
- Select your bucket (or create a new one in the same Region as your DynamoDB table).
- Click Upload, then drag-and-drop your JSON file or select it from your local machine.
- Click Upload to confirm.
2.3 Create a DynamoDB Import Job#
- Go to the DynamoDB Console.
- In the left menu, select Imports from S3, then click Import from S3.
- On the “Import from S3” page:
  - Source S3 URL: Enter the path to your JSON file (e.g., `s3://my-bucket/dynamodb-import/data.json`).
  - Import file format: Choose DynamoDB JSON.
  - Destination table: Enter a name for the new table and define its partition key (and sort key, if applicable). Note that Import from S3 always creates a new table; to load data into an existing table, use the SDK method instead.
- Ensure your IAM user or role has `s3:GetObject` (to read the file) and `dynamodb:ImportTable` permissions; if not, grant them via the AWS IAM console.
- Click Import to start the job.
2.4 Monitor the Import Job#
- Track progress in the DynamoDB console under Imports (left menu).
- Statuses: `IN_PROGRESS`, `COMPLETED`, `FAILED`.
- For failed jobs, check the Failure reason (e.g., invalid JSON, missing permissions, mismatched primary keys).
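Many failed imports trace back to malformed input, so a quick local check before uploading can save a retry cycle. A minimal sketch (the helper name is hypothetical; it assumes `id` is the key attribute and accepts items either bare or wrapped in an `Item` key):

```python
import json

def validate_json_lines(path, key_attr="id"):
    """Collect (line_number, problem) pairs for a DynamoDB JSON Lines file."""
    problems = []
    with open(path) as f:
        for n, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are harmless
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append((n, f"invalid JSON: {e.msg}"))
                continue
            if not isinstance(obj, dict):
                problems.append((n, "expected one JSON object per line"))
                continue
            # Accept items either bare or wrapped in an "Item" key
            item = obj.get("Item", obj)
            if key_attr not in item:
                problems.append((n, f"missing key attribute '{key_attr}'"))
    return problems
```

Running it over your export before upload surfaces exactly the failure classes listed above: invalid JSON, array-style files, and items missing the key attribute.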
Method 2: Import via AWS SDK (Programmatic Approach)#
The AWS SDK is ideal for small-to-medium datasets, programmatic workflows, or integrating data imports into scripts. We’ll use Python (boto3) and Node.js (AWS SDK for JavaScript) as examples.
3.1 AWS SDK Setup#
Python (boto3)#
```shell
pip install boto3
aws configure  # Enter your AWS access key, secret key, Region, and output format
```
Node.js#
```shell
npm install aws-sdk
# Configure credentials via environment variables or ~/.aws/credentials
```
3.2 Python Example (Boto3)#
This example reads items line by line and writes them with `put_item`. For bulk loads, `BatchWriteItem` writes up to 25 items (16 MB total) per request; boto3's `batch_writer()` wraps it and handles batching and retries automatically.
Step 1: Prepare the JSON File#
Use the same JSON Lines layout as in the console method, but with each line as the bare item map (without an `Item` wrapper), since `put_item` takes the item directly.
Step 2: Python Script#
```python
import boto3
import json

# The low-level client accepts items in DynamoDB's typed JSON format ({"S": ...})
dynamodb = boto3.client('dynamodb', region_name='us-east-1')
TABLE_NAME = 'YourTableName'

def import_json_to_dynamodb(json_file_path):
    with open(json_file_path, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            item = json.loads(line)  # Parse each line as one DynamoDB item
            dynamodb.put_item(TableName=TABLE_NAME, Item=item)
            print(f"Added item: {item['id']}")

if __name__ == "__main__":
    import_json_to_dynamodb('data.json')
```
Notes:#
- Error Handling: boto3 retries `ProvisionedThroughputExceededException` automatically; tune the behavior with `botocore.config.Config(retries={...})` if needed.
- BatchWriteItem: For bulk imports, let the higher-level `batch_writer()` group items into batches of 25:

```python
from boto3.dynamodb.types import TypeDeserializer

def batch_write(table, items):
    # table: a boto3 Table resource; items: dicts in DynamoDB typed-JSON format
    deserializer = TypeDeserializer()
    with table.batch_writer() as batch:  # buffers writes into 25-item batches
        for item in items:
            # batch_writer expects plain Python values, so deserialize first
            plain = {key: deserializer.deserialize(val) for key, val in item.items()}
            batch.put_item(Item=plain)
```
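If you call `BatchWriteItem` directly instead of using `batch_writer()`, the 25-item cap means the client must do its own chunking. A minimal, dependency-free sketch (the helper name is illustrative):

```python
def chunked(items, size=25):
    # Yield successive fixed-size slices of a list.
    # 25 is DynamoDB's per-request cap for BatchWriteItem.
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded slice can then be mapped to the `PutRequest` entries of one `batch_write_item` call; the last slice simply carries the remainder.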
3.3 Node.js Example (AWS SDK for JavaScript)#
This example writes items one at a time with `putItem`; for bulk loads, `batchWriteItem` accepts up to 25 items per request.
```javascript
const AWS = require('aws-sdk');
const fs = require('fs');

AWS.config.update({ region: 'us-east-1' });
// The low-level client accepts items in DynamoDB's typed JSON format ({ S: ... })
const dynamodb = new AWS.DynamoDB();

async function importJsonToDynamoDB(filePath) {
  const lines = fs.readFileSync(filePath, 'utf8').split('\n').filter(line => line.trim());
  for (const line of lines) {
    const item = JSON.parse(line);
    await dynamodb.putItem({ TableName: 'YourTableName', Item: item }).promise();
    console.log(`Added item: ${JSON.stringify(item.id)}`);
  }
}

importJsonToDynamoDB('data.json').catch(console.error);
```
Console vs. SDK: When to Use Which?#
| Factor | Console (S3 Import) | AWS SDK |
|---|---|---|
| Dataset Size | Large (TBs) | Small-to-medium (GBs or less) |
| Coding Required | No | Yes (Python, Node.js, etc.) |
| Flexibility | Limited (point-and-click) | High (custom logic, retries, validation) |
| Dependencies | Requires S3 bucket | None (direct API calls) |
| Use Cases | One-time imports, large backups | Scripts, CI/CD pipelines, dynamic data |
Troubleshooting Common Issues#
- Invalid JSON Format: Ensure the file uses JSON Lines (one item per line) and correct DynamoDB data types (e.g., `{"S": "string"}`).
- S3 Permissions: The IAM role for console imports must allow `s3:GetObject` on the S3 bucket.
- Throttling: SDK imports may hit `ProvisionedThroughputExceededException`. Retry with exponential backoff or increase table throughput.
- Mismatched Primary Keys: Imported items must match the table’s primary key schema (e.g., an item missing `id` fails if `id` is the primary key).
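The exponential-backoff advice above can be sketched without any AWS dependency. Here `write_item` stands in for whatever throttled call you are retrying, and the exception type is an assumption for illustration:

```python
import random
import time

def with_backoff(write_item, item, max_attempts=5, base_delay=0.1):
    # Retry a throttled write with exponential backoff plus jitter.
    # Assumption for this sketch: write_item raises RuntimeError when throttled.
    for attempt in range(max_attempts):
        try:
            return write_item(item)
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Delay doubles each attempt; jitter spreads out concurrent retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

In practice boto3 and the AWS SDK for JavaScript already implement this pattern internally; the sketch is mainly useful when wrapping higher-level operations of your own.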
Conclusion#
Importing JSON into DynamoDB is straightforward with either the AWS Console (for large, no-code imports via S3) or the AWS SDK (for programmatic control). Choose the console for simplicity and scale, and the SDK for flexibility and integration into scripts. Always validate your JSON format and test with a small dataset first!