Accessing local files from a web browser using HTML5 File API

HTML5 introduced a standard interface called the File API, which lets programmers access metadata and read the contents of local files selected by a user. The selection is typically done with an input element, but recent browsers also allow drag and drop for this task. The API opens up a wide range of possibilities, including at least: checking files before sending them to the server, showing thumbnails while uploading files, and processing files without any interaction with the server.

In this article I would like to show how to use this API to read the basic properties of a file chosen by a user, and additionally to read its contents and calculate its SHA-1 hash. The main window of the sample application will look like this:
[Screenshot of the sample application: html5fileapi]

Because the File API is quite new, it is supported only by the latest versions of web browsers. If you still have an older one, I strongly suggest upgrading it.

Accessing selected files

The classic way to select a file from the local disk is to add an HTML input element with type file:

<input id="fileInput" type="file" />

If the optional multiple attribute is specified, it is possible to select many files at once.

In JavaScript, the FileList object representing the files selected by a user is accessible through the files property of the file input element. FileList behaves like an array, so accessing individual File objects is easy:

var fileInput = $("#fileInput")[0];
var fileList = fileInput.files;
for (var i = 0; i < fileList.length; i++) {
    var file = fileList[i];
    // handle file
}

Of course, if the user selected only one file (e.g. because the multiple attribute was not specified), the FileList object contains only one file at index 0.
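Because FileList is only array-like, methods such as map() are not available on it directly. A minimal sketch of one way around this, assuming a browser that supports Array.from; the helper names toFileArray and selectedFileNames are made up for this example:

```javascript
// Hypothetical helper: copies an array-like FileList into a real array
// so that array methods such as map() and filter() become available.
function toFileArray(fileList) {
    return Array.from(fileList);
}

// Hypothetical helper: returns the names of all selected files.
function selectedFileNames(fileList) {
    return toFileArray(fileList).map(function (file) {
        return file.name;
    });
}

// In a browser, with the input element shown earlier:
// var names = selectedFileNames(document.getElementById("fileInput").files);
```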

Once we have a reference to a File object representing a file selected by the user, we can read the following attributes, which provide useful information about the file:

  • name – The name of the file without any path information
  • size – The size of the file in bytes
  • type – The MIME type of the file (e.g. text/plain) or empty string if it is not available
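  • lastModifiedDate – The date of the last modification of the file (current browsers deprecate this attribute in favor of lastModified, a timestamp in milliseconds since the Unix epoch)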
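These attributes are enough to check a file before it is sent to a server, one of the uses mentioned in the introduction. The following is only a sketch: the function name validateFile, the size limit and the list of allowed types are assumptions made for illustration.

```javascript
// Hypothetical limits, chosen only for this example.
var MAX_SIZE_BYTES = 5 * 1024 * 1024; // 5 MiB
var ALLOWED_TYPES = ["image/png", "image/jpeg"];

// Returns a list of human-readable problems; an empty list means the file passed.
// Only the name, size and type attributes are read.
function validateFile(file) {
    var problems = [];
    if (file.size > MAX_SIZE_BYTES) {
        problems.push(file.name + " is larger than " + MAX_SIZE_BYTES + " bytes");
    }
    // type is an empty string when the browser cannot determine the MIME type.
    if (file.type === "") {
        problems.push(file.name + " has an unknown type");
    } else if (ALLOWED_TYPES.indexOf(file.type) === -1) {
        problems.push(file.name + " has unsupported type " + file.type);
    }
    return problems;
}
```

Since the type attribute is reported by the browser, a check like this improves feedback to the user but does not replace validation on the server.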

Reading file contents

Reading the contents of a file is a bit more complicated. First we have to create a FileReader object, assign listeners to one or more of its events, and then call one of these asynchronous methods to start reading the file:

  • FileReader.readAsText(Blob or File, optional_encoding) – Starts reading a text file using the given encoding. If the encoding is not specified, the file contents are interpreted as UTF-8 text.
  • FileReader.readAsDataURL(Blob or File) – Starts reading a file into a data URL string. Typically, this operation is used to read an image file, and its result can be assigned to the src attribute of an img element.
  • FileReader.readAsArrayBuffer(Blob or File) – Starts reading a binary file into an ArrayBuffer object.

These three methods are asynchronous, which means that they return immediately while the reading operation continues in the background. Once the operation finishes successfully, the onload event of FileReader is triggered and the data can be accessed through the result property of the FileReader object. The complete list of available events is below:

  • onloadstart – Triggered when the reading operation starts
  • onload – Triggered when the reading operation finishes successfully
  • onloadend – Triggered when the reading operation finishes (either successfully or with an error)
  • onprogress – Triggered multiple times while reading. Can be used to monitor the progress.
  • onerror – Triggered when the reading operation finishes with an error
  • onabort – Triggered when the reading operation is aborted
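The pattern described above (create a reader, assign listeners, start reading) can be sketched as a small helper. The name readWithCallbacks and the shape of the callbacks object are assumptions made for this example; the reader is passed in as a parameter only to keep the wiring easy to follow.

```javascript
// Hypothetical helper: starts reading `blob` into an ArrayBuffer and reports
// the outcome through callbacks. `reader` is expected to be a FileReader.
function readWithCallbacks(reader, blob, callbacks) {
    reader.onload = function () {
        // On success the data is available in the result property.
        callbacks.success(reader.result);
    };
    reader.onerror = function () {
        // On failure the error property describes what went wrong.
        callbacks.failure(reader.error);
    };
    reader.onabort = function () {
        callbacks.aborted();
    };
    // Returns immediately; one of the handlers above fires later.
    reader.readAsArrayBuffer(blob);
}

// In a browser:
// readWithCallbacks(new FileReader(), file, {
//     success: function (buffer) { console.log(buffer.byteLength); },
//     failure: function (error) { console.log(error.name); },
//     aborted: function () { console.log("aborted"); }
// });
```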

Example

Knowing the basics, we can start writing the sample application by creating the index.html file:

<!DOCTYPE html>
<html>
    <head>
        <title>HTML5 File API</title>
        <meta charset="UTF-8" />
        <script type="text/javascript" src="js/libs/jquery/jquery.js"></script>
        <script type="text/javascript" src="js/libs/sha1.js"></script>
        <script type="text/javascript" src="js/process.js"></script>
        <link type="text/css" rel="stylesheet" href="css/styles.css" />
    </head>
    <body>
        <form id="fileForm" class="fileForm">
            <input id="fileInput" type="file" required="required" />
            <button id="processButton" type="submit">Process</button>
        </form>

        <div class="fileStatus">
            <progress id="processProgress" max="1" value="0"></progress>
            <p><span id="processError" class="error"></span></p>
            <table id="processResults">
                <thead>
                    <tr><th>Param</th><th>Value</th></tr>
                </thead>
                <tbody>
                    <tr><td>Name</td><td id="nameValue"></td></tr>
                    <tr><td>Size</td><td id="sizeValue"></td></tr>
                    <tr><td>Type</td><td id="typeValue"></td></tr>
                    <tr><td>Hash</td><td id="hashValue"></td></tr>
                </tbody>
            </table>
        </div>
    </body>
</html>

We use jQuery to simplify several DOM operations, and a very nice JavaScript library (created by T. Michael Keesey) to calculate SHA-1 hashes. A basic form lets the user select a single file from the local disk and submit it for processing. Once the file is submitted, we periodically inform the user about the progress of the operation using the progress element, and finally update the values in the table. If the reading fails, we show the error instead.

One important thing I have not mentioned yet is the js/process.js file, where the logic of our application lives:

function calculateHash(fileContents) {
    return sha1.hash(fileContents);
}

function startProcessing(fileInput, onsuccess, onerror, onprogress) {
    var fileList = fileInput[0].files;
    var file = fileList[0];
    var results = {
        name: file.name,
        size: file.size,
        type: file.type,
        hash: ""
    };

    var fileReader = new FileReader();
    fileReader.onload = function(e) {
        results.hash = calculateHash(e.target.result);
        onsuccess(results);
    };
    fileReader.onerror = function(e) {
        onerror(e.target.error.name);
    };
    fileReader.onprogress = function(e) {
        onprogress(e.loaded, e.total);
    };
    fileReader.readAsArrayBuffer(file);
}

function setResults(name, size, type, hash) {
    var table = $("#processResults");
    table.find("#nameValue").text(name);
    table.find("#sizeValue").text(size);
    table.find("#typeValue").text(type);
    table.find("#hashValue").text(hash);
}

function clearResults() {
    $("#processProgress").val(0).show();
    $("#processError").hide();
    $("#processResults").hide();
    setResults("", "", "", "");
}

function populateResults(data) {
    $("#processProgress").val(1);
    $("#processResults").show();
    setResults(data.name, data.size, data.type, data.hash);
}

function populateError(msg) {
    $("#processProgress").hide();
    $("#processError").text("Failed to read file: " + msg);
    $("#processError").show();
}

function populateProgress(loaded, total) {
    // Guard against empty files, where total is 0 and the ratio would be NaN.
    $("#processProgress").val(total > 0 ? loaded / total : 0);
}

function initialize() {
    $("#fileForm").submit(function(e) {
        e.preventDefault();
        clearResults();
        startProcessing($("#fileInput"), populateResults, populateError, populateProgress);
    });
    clearResults();
}

$(document).ready(initialize);

The most interesting things happen in the first two functions. The calculateHash() function calculates and returns the SHA-1 hash of the already-read file contents (passed in as an ArrayBuffer object). This function is synchronous, which means that it blocks the page until it finishes. On my system it takes about 2 seconds per 100 MB of data, which in practice is not a big issue. Should it become a problem, you may consider moving the SHA-1 calculation to a separate thread using web workers.
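Besides web workers, another option in current browsers is the built-in Web Crypto API, whose crypto.subtle.digest() method returns a Promise instead of computing the hash synchronously. This is a different technique from the sha1.js library used in this article and is shown only as a sketch. The helper names toHex and sha1Hex are made up for the example, and crypto.subtle is only available in secure contexts (HTTPS or localhost).

```javascript
// Converts an ArrayBuffer into a lowercase hexadecimal string.
function toHex(buffer) {
    var bytes = new Uint8Array(buffer);
    var hex = "";
    for (var i = 0; i < bytes.length; i++) {
        // Pad each byte to two digits so that 0x0f becomes "0f", not "f".
        hex += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
    }
    return hex;
}

// Resolves with the SHA-1 hash of the given ArrayBuffer as a hex string.
// Assumes a global `crypto.subtle`, as provided by browsers in secure contexts.
function sha1Hex(buffer) {
    return crypto.subtle.digest("SHA-1", buffer).then(toHex);
}

// Possible use inside an onload handler:
// sha1Hex(e.target.result).then(function (hash) { console.log(hash); });
```

With this approach the hash arrives through the Promise rather than as a return value, so the code that consumes it has to move into the then() callback.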

The startProcessing() function is where all the reading happens. First, we obtain an object named file representing the file selected by the user, and create an object named results holding the data to be returned after the reading operation finishes successfully. Then we create a FileReader, set up listeners for its onload, onerror and onprogress events, and start reading the file into an ArrayBuffer. The call to fileReader.readAsArrayBuffer() returns immediately.

When the reading operation finishes successfully, the onload event is triggered and the first event listener is called. In this listener we calculate the SHA-1 hash and invoke the onsuccess callback. Please note that e.target.result is the same as fileReader.result and holds the contents of the file. In case of an error, the onerror event is triggered, which in our application shows an error message to the user. Additionally, the onprogress event is fired several times while reading the file and is used to update the progress bar.
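The event object passed to onprogress also carries a lengthComputable flag that says whether total is meaningful. A small sketch of a more defensive progress calculation, with progressFraction being a hypothetical helper name:

```javascript
// Hypothetical helper: turns a progress event into a value between 0 and 1,
// or returns null when the total size is unknown or zero.
function progressFraction(event) {
    if (!event.lengthComputable || event.total === 0) {
        return null;
    }
    return event.loaded / event.total;
}

// Possible use with a FileReader:
// fileReader.onprogress = function (e) {
//     var fraction = progressFraction(e);
//     if (fraction !== null) { onprogress(e.loaded, e.total); }
// };
```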

The rest of the functions are responsible for updating, showing and hiding elements on the web page. The initialize() function additionally registers an event listener that is called when the user submits the form.

Conclusion

The File API is nothing fancy, but used correctly it can significantly improve the user experience by reducing the number of round trips to the server and providing better feedback to the user (e.g. showing thumbnails of selected files or monitoring the progress of a file upload).

This article shows only a portion of what the File API can do. If you are interested in knowing more, take a look at the W3C Working Draft.

The source code for the example can be found on GitHub.

About Robert Piasecki

Husband and father, Java software developer, Linux and open-source fan.