
Storing data in a serverless application

In my previous post, I discussed how to create a serverless application that keeps the API credentials secure. In this post, I’ll discuss storing data and files from a serverless application.

Serverless applications have no built-in place to store persistent data or files: there is no bundled database or permanent file system. Applications can use the /tmp directory for small, short-lived files, which is a good fit when, for example, a web application has to generate a file for export. If your serverless application requires permanent data or file storage, however, you must decide which technologies it will use to provide it. Fortunately, most cloud platforms include tools for file and data storage.
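As a minimal sketch of the temporary-storage case, the following Node.js code writes an export file to the temporary directory. The helper name `writeTempExport` and the file name are hypothetical. On AWS Lambda, `os.tmpdir()` is expected to resolve to `/tmp`, which is limited in size and should not be relied on between invocations.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes text to a file in the temp directory
// and returns the full path of the file that was written.
function writeTempExport(fileName, contents) {
    const filePath = path.join(os.tmpdir(), fileName);
    fs.writeFileSync(filePath, contents, 'utf8');
    return filePath;
}

const exportPath = writeTempExport('export-example.csv', 'id,title\n1,Example\n');
console.log(`Wrote temporary export to ${exportPath}`);
```

Anything that has to outlive the current invocation still needs one of the storage services described below.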

File storage

If your application is deployed to AWS, then S3 is likely the best option for file storage. You can read or write data in S3 buckets via the AWS SDK for the programming language of your choice. Below is an example of writing a CSV file of data to an S3 bucket.

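const aws = require('aws-sdk');
// Assumption: stringify is the synchronous API of the csv-stringify package (v6 import path)
const { stringify } = require('csv-stringify/sync');

const s3 = new aws.S3();

// Writes records to the given bucket and key as a CSV file.
// records has the shape [{oclcnumber: ..., newOCLCNumber: ...}, ...]
async function writeCsv(bucket, dstKey, records) {
    // Map record keys to CSV column headers
    const columns = {
        oclcnumber: "Original OCLC Number",
        newOCLCNumber: "New OCLC Number"
    };
    // Create CSV string
    const csv_string = stringify(records, {header: true, columns: columns});
    try {
        await s3.putObject({Bucket: bucket, Key: dstKey, Body: csv_string}).promise();
        console.log('success');
        return { status: 'success' };
    } catch (err) {
        console.log(err, err.stack);
        return err;
    }
}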
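
Reading a file back follows the same pattern. The sketch below assumes the AWS SDK for JavaScript v2 (the `.promise()` style). `readTextObject` is a hypothetical helper name, and the S3 client is passed in as a parameter so the function can be exercised without network access.

```javascript
// Hypothetical helper: reads an object from S3 and returns its body as text.
// `s3` is expected to be an AWS SDK v2 S3 client, e.g. new aws.S3().
async function readTextObject(s3, bucket, key) {
    const data = await s3.getObject({ Bucket: bucket, Key: key }).promise();
    // In SDK v2 on Node.js, Body is a Buffer
    return data.Body.toString('utf-8');
}
```

Inside a Lambda handler this would be called as `await readTextObject(new aws.S3(), bucket, key)`.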

Other platforms, such as Google Cloud Platform, have comparable solutions. Alternatively, if your organization uses Dropbox or some other file-sharing solution with an API, your application can likely call that API instead.

Log storage

Most applications also need to store application logs. When you develop a serverless application, you have a couple of options. First, you can store log data in an S3 bucket, which is fairly similar to how logs are traditionally stored on an application server. However, this approach doesn't include tools for viewing or analyzing log data. A more robust approach is to use AWS CloudWatch, which allows you to collect and store logs, analyze and visualize log data, and perform system monitoring.
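On AWS Lambda, output that a Node.js function writes with `console.log` is sent to CloudWatch Logs, provided the function's role allows it, so basic logging needs no extra client. Below is a minimal sketch, assuming you want one JSON object per log line so that entries are easier to filter later. `logEvent` and its field names are hypothetical.

```javascript
// Hypothetical helper: builds one JSON log line, writes it to stdout,
// and returns the line that was written.
function logEvent(level, message, details = {}) {
    const entry = JSON.stringify({
        timestamp: new Date().toISOString(),
        level: level,
        message: message,
        ...details
    });
    console.log(entry);
    return entry;
}

logEvent('info', 'CSV written', { bucket: 'example-bucket', key: 'report.csv' });
```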

Data storage

If the application you are creating needs to store data in a structured fashion, then it will likely need to interact with a cloud-based relational or NoSQL database. Amazon’s platform offers both types of tools. AWS DynamoDB is a cloud-based NoSQL solution that allows you to store JSON documents in tables. AWS RDS is a cloud-based relational database service capable of running a variety of database engines, such as PostgreSQL, MySQL, Microsoft SQL Server, and others. Both AWS DynamoDB and RDS can be used from AWS Lambda. Calls to DynamoDB are made via an API (typically through the AWS SDK). Most calls to a database hosted in RDS are made via a client library for that database, which establishes a database connection using a username and password. One exception to the client-library pattern is AWS Athena, a separate SQL query service that is called through its own API.

Below is a simple example in Node.js that adds a JSON document to a DynamoDB table.

const aws = require('aws-sdk');
const { v1: uuidv1 } = require('uuid');

exports.handler = async (event, context) => {
    const dynamodb_doc = new aws.DynamoDB.DocumentClient({
        apiVersion: '2012-08-10',
        region: 'us-east-1'
    });
    // Item to add to the refDeskLog table
    let collection = {
        TableName: 'refDeskLog',
        Item: {
            'transactionId': uuidv1(),
            'date': Date.now(),
            'question_type': 'directional'
        }
    };
    try {
        await dynamodb_doc.put(collection).promise();
        return { status: 'success' };
    } catch (err) {
        console.log(err, err.stack);
        return err;
    }
};

This code adds a simple JSON document to the refDeskLog table. The JSON contains an ID for the transaction, the date, and a “question type.”

When comparing options for file and data storage, note that tools offered by the same platform your serverless code runs on can typically be accessed by granting permissions to the service role assigned to the Lambda executing the code. This has the advantage of consolidating authorization management in a single place rather than spreading it across API keys, username/password combinations, etc.
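As an illustration, the permissions for the two examples in this post could be expressed as a single IAM policy attached to the Lambda's service role. The sketch below builds that policy document as a JavaScript object. The function name, account ID, and bucket name are placeholders, not values from a real deployment.

```javascript
// Hypothetical helper: builds an IAM policy document that allows writing
// objects to one S3 bucket and putting items into one DynamoDB table.
function buildStoragePolicy({ region, accountId, tableName, bucketName }) {
    return {
        Version: '2012-10-17',
        Statement: [
            {
                Effect: 'Allow',
                Action: ['s3:PutObject'],
                Resource: `arn:aws:s3:::${bucketName}/*`
            },
            {
                Effect: 'Allow',
                Action: ['dynamodb:PutItem'],
                Resource: `arn:aws:dynamodb:${region}:${accountId}:table/${tableName}`
            }
        ]
    };
}

const policy = buildStoragePolicy({
    region: 'us-east-1',
    accountId: '123456789012',   // placeholder account ID
    tableName: 'refDeskLog',
    bucketName: 'example-bucket' // placeholder bucket name
});
console.log(JSON.stringify(policy, null, 2));
```

With a policy like this on the role, neither example needs an API key or password in its code.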

In this post, we’ve examined options for storing data used in serverless applications. In our next post, we’ll examine how Lambda code is triggered via “events.”

Karen Coombs
Senior Product Analyst