Understand the basics

The first step in investigating digital asset management is to define the goal of the system. Ideally, the system minimizes the effort of storing, retrieving and using content while maximizing return on investment by enabling efficient repurposing of that content.

No matter what the specific approach, one ingredient underlies all digital asset-management systems: metadata. Metadata is any textual, numerical, graphical or low-resolution content that helps to identify, explain or modify the media file to which it is attached. It may be as simple as a file name and format, or as extensive as multiple levels of information about how the media was created and who created it, detailed content descriptions, time-code-annotated key frames, and text logs, all identified at frame-level resolution.
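
As a rough illustration, a metadata record of that kind might be modeled as a simple structure. The field names below are hypothetical and exist only to make the idea concrete; they are not taken from any particular product.

```python
from dataclasses import dataclass, field

@dataclass
class ClipMetadata:
    """Minimal, hypothetical metadata record for one media file."""
    file_name: str
    file_format: str                  # e.g. "MXF" or "QuickTime"
    created_by: str = ""
    created_at: str = ""              # ISO 8601 date and time of creation
    description: str = ""
    key_frames: list = field(default_factory=list)  # time-code-stamped thumbnails
    text_log: str = ""                # shot log or transcript, annotated by time code

# Example usage with invented values:
clip = ClipMetadata(
    file_name="interview_cam_a.mxf",
    file_format="MXF",
    created_by="Field crew 3",
    created_at="2003-05-12T14:30:00",
    description="Mayor interview, camera A, wide shot",
)
```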

A simple example of how it works is a composite frame from a motion picture, for which multiple elements are created separately and then melded together as a final scene. The elements include the base camera original for foreground action, the background matte, the special-effects elements, the dialogue audio, sound effects, and theatrical score. The digital asset-management system handles the composite shot as a temporal file, identifying the source, length, creator, date and time of creation, description, file format, and storage location of all of the disparate elements as metadata.
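
One way to picture that composite record, again with invented names and storage paths, is as a parent entry whose metadata identifies each contributing element and where it lives:

```python
# Hypothetical sketch: the composite shot is a parent record whose metadata
# identifies each contributing element and its storage location.
composite_shot = {
    "title": "Scene 42, final composite",
    "timecode_in": "01:02:10:00",
    "timecode_out": "01:02:18:12",
    "elements": [
        {"role": "foreground action", "source": "camera original",  "location": "vault/scene42/fg.dpx"},
        {"role": "background matte",  "source": "matte element",    "location": "vault/scene42/bg.tif"},
        {"role": "effects layer",     "source": "effects render",   "location": "vault/scene42/fx.exr"},
        {"role": "dialogue",          "source": "production audio", "location": "vault/scene42/dialogue.wav"},
        {"role": "sound effects",     "source": "effects library",  "location": "vault/scene42/sfx.wav"},
        {"role": "score",             "source": "scoring session",  "location": "vault/scene42/score.wav"},
    ],
}
```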

If a specific element or layer of the scene needs to be reworked, the metadata can be used to identify where and how the digital medium is stored, search for the content, and retrieve it for use. A producer can look at text logs, narrow the search to a series of key frames to identify the specific shot, and then look at strata of elements arranged by time code. Individual elements can then be viewed or listened to via streaming media at the producer's workstation.

The basics

A basic system for digital asset management is composed of an ingestion application, a storage system or systems, a database with search and retrieval mechanisms, and, in most cases, a rights-management application.

The ingestion application is critical. The most efficient way to capture metadata is to extract or create it while the parent medium is being ingested, or recorded, into a server or production system. This can be done by automated or manual methods in real time while the asset is being ingested from the videotape, satellite feed or live camera. Metadata can also be extracted after assets are transferred into the system as digital files, or during non-real-time processes such as film scanning.
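
A highly simplified sketch of that idea follows: the ingest routine stamps basic metadata in the same pass that records the essence, rather than as a separate step afterward. Every name here is illustrative.

```python
import datetime

def ingest(source_id: str, description: str, record_fn):
    """Capture essence and basic metadata in one pass.

    `record_fn` stands in for whatever actually records the material
    (tape deck control, satellite feed, live camera) and is assumed to
    return the path of the stored file.
    """
    started = datetime.datetime.now().isoformat(timespec="seconds")
    stored_path = record_fn(source_id)   # record the high-resolution essence
    return {
        "source": source_id,
        "description": description,
        "ingest_started": started,
        "ingest_finished": datetime.datetime.now().isoformat(timespec="seconds"),
        "location": stored_path,
    }
```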

The ingestion client is typically a PC application that includes a method for controlling or accepting the source material. This can be as simple as a serial connection to a videotape recorder or server port and as involved as multiple network sessions with several digital repositories. Other applications may be controlled or invoked by the main ingestion application to aid in the extraction and creation of metadata. For example, once ingestion is slated to begin, the main application may invoke an application for identifying and extracting key frames or individual graphic thumbnails of specific frames, which are later used to identify the shot boundaries.
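
A real key-frame extractor is considerably more sophisticated, but the underlying idea can be sketched with simple frame differencing, here using OpenCV purely as one convenient way to read frames:

```python
import cv2  # OpenCV, used here only to pull frames from a file

def extract_keyframes(path: str, diff_threshold: float = 40.0):
    """Naive shot-boundary detector: flag a key frame wherever the average
    pixel difference between consecutive frames jumps past a threshold."""
    cap = cv2.VideoCapture(path)
    keyframes, prev_gray, index = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None and cv2.absdiff(gray, prev_gray).mean() > diff_threshold:
            keyframes.append((index, frame))   # frame number and thumbnail source
        prev_gray = gray
        index += 1
    cap.release()
    return keyframes
```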

Another application may open a session with a low-resolution video server and create and stream a low-resolution version of the incoming material. This lets users later browse low-resolution proxies of the content without the cost of high-bandwidth networking and without spending time finding and retrieving the high-resolution media. It also allows several users to share the same content at once.
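
Proxy creation is often just a transcode of the incoming material at browse resolution. A minimal sketch, assuming FFmpeg is available on the system; the resolution and bit-rate figures are arbitrary examples, not recommendations:

```python
import subprocess

def make_proxy(high_res_path: str, proxy_path: str):
    """Create a small browse proxy of the incoming material with FFmpeg."""
    subprocess.run(
        [
            "ffmpeg", "-i", high_res_path,
            "-vf", "scale=320:-2",          # shrink to browse resolution
            "-c:v", "libx264", "-b:v", "500k",
            "-c:a", "aac", "-b:a", "64k",
            proxy_path,
        ],
        check=True,
    )
```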

The choice of storage systems depends on the design of the overall system and the work flow surrounding it. Storage systems are typically multiple high-capacity disk drives in RAID (Redundant Array of Inexpensive Disks) configurations, sized for the data rates required to sustain uninterrupted playback of video and audio. One system hosts the database and text information, while individual servers are employed to host each media format.
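
The sizing question is ultimately arithmetic: the array must sustain the combined data rate of every stream it serves at once, plus headroom, and hold the required number of hours online. A back-of-the-envelope sketch with made-up figures:

```python
# Back-of-the-envelope storage sizing; all numbers are illustrative.
stream_rate_mbps = 25      # e.g. a 25 Mb/s video stream
channels = 8               # simultaneous record/playback channels
headroom = 1.5             # safety factor for file copies, RAID rebuilds, etc.

required_mbps = stream_rate_mbps * channels * headroom
print(f"Sustained throughput needed: {required_mbps:.0f} Mb/s "
      f"({required_mbps / 8:.0f} MB/s)")

hours_online = 200         # hours of content kept on spinning disk
capacity_gb = stream_rate_mbps / 8 * 3600 * hours_online / 1000
print(f"Approximate capacity for {hours_online} h of content: {capacity_gb:.0f} GB")
```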

The low-resolution media servers are typically "off-the-shelf" hardware hosting multiple RAID drives and, in some cases, may include fully redundant hardware and disk storage. High-resolution media, usually broadcast-quality video and audio, is typically stored on dedicated hardware systems optimized for video quality and guaranteed bandwidth for playback to air. The playback system must be highly fault-tolerant; it typically comprises smaller servers with two to 14 dedicated ports each. The dedicated-port arrangement ensures that internal bandwidth and bandwidth to individual playback channels are sufficient to provide error-free playback performance.

Storage technology is on a continual march forward. New storage configurations are emerging that allow multiple file and media formats to be stored within a single storage system. And storage area networks, or SANs, provide large disk capacities that support a number of media formats and appear to users as one contiguous storage system. Within the SAN, different storage systems are distributed among computing devices and connected by a single data network. This offers data protection and redundancy through disk mirroring, automated backup and data retrieval, archiving, and the migration of data from one storage system to another.

Network attached storage, or NAS, is a bit different in that the disk storage is set up with its own network address and exists independently of any of the servers or computers in the system. This speeds access to and reading from the disks because applications and storage no longer compete for processor resources.

Perhaps the most critical element of the system is the database, which allows the entire asset collection to be searched for specific material based on any of its metadata. Object-relational databases perform best because they allow different media objects to be included in individual database records, complete with their class properties, essence and associated metadata.

The most effective object-relational databases allow media assets to be treated temporally. That is, the database understands the nature of a moving picture and allows different attributes of the file and its elements to be cataloged separately with a time-code base. In this manner, content can be cataloged, described, searched and retrieved at the frame level, allowing sufficient granularity to support production operations such as nonlinear editing.
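
A drastically simplified relational sketch of that temporal treatment might pair an asset table with time-code-bounded segment rows; the table and column names are invented for illustration:

```python
import sqlite3

# Minimal schema sketch: one row per asset, plus time-code-bounded segment
# rows that can be cataloged and searched at frame-level granularity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (
    asset_id    INTEGER PRIMARY KEY,
    title       TEXT,
    file_format TEXT,
    created_by  TEXT,
    created_at  TEXT,
    location    TEXT
);
CREATE TABLE segment (
    segment_id  INTEGER PRIMARY KEY,
    asset_id    INTEGER REFERENCES asset(asset_id),
    tc_in       TEXT,      -- e.g. '01:02:10:00'
    tc_out      TEXT,
    description TEXT
);
""")
```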

The database's search engine is also a critical element, as it is the interface presented to the individual user. Most systems employ a browser-based search engine, which allows the application to be distributed easily and lets changes to the user interface or database schema be rolled out to every user at once. The search engine should provide different levels of search complexity, from a simple single-word text search to more involved Boolean queries that combine full-text searching with date and time ranges.
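
Against a schema like the sketch above, even the search layer reduces to parameterized queries. For example, a text term combined with a date range (again only an illustration, not a production search engine):

```python
def search(conn, term, date_from, date_to):
    """Small search sketch: text match on segment descriptions,
    constrained to a creation-date range on the parent asset."""
    return conn.execute(
        """
        SELECT a.title, s.tc_in, s.tc_out, s.description
        FROM segment AS s
        JOIN asset AS a ON a.asset_id = s.asset_id
        WHERE s.description LIKE ?
          AND a.created_at BETWEEN ? AND ?
        """,
        (f"%{term}%", date_from, date_to),
    ).fetchall()
```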

Rights management can be the Achilles' heel of any digital asset-management system. Assigning access rights and maintaining records of individual licenses can be difficult for media outlets that do not own the content outright. In some cases, asset managers have simply scanned legal documents and attached them as metadata, so a new decision may be rendered each time the content is requested for use. In the simplest case, various access levels can be engineered within the system to limit access to users based on their log-in. The system must support assignment of access rights to users or allow users to acquire licensing and then access the content.
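
The access-level portion of that, at least, is straightforward to sketch; a real system would also have to consult the license records attached to the asset's metadata. The roles and levels below are invented:

```python
# Hypothetical access-level check keyed to the user's login role.
ACCESS_LEVELS = {"viewer": 1, "producer": 2, "rights_admin": 3}

def can_retrieve(user_role: str, asset_required_level: int) -> bool:
    """Return True if the logged-in user's role clears the asset's
    minimum access level."""
    return ACCESS_LEVELS.get(user_role, 0) >= asset_required_level

# e.g. a licensed-only asset might require level 2:
assert can_retrieve("producer", 2)
assert not can_retrieve("viewer", 2)
```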

Rights-management issues should not prevent building a system. But special considerations exist for each case, and the metadata and work flow of the system must be designed accordingly.