Enter the Data Definition Language
A developer perspective

Most developers have faced situations in which they need to read data from a file of some given format but can find no real library routines for this purpose, other than the standard file-handling libraries. In such cases, they would have to develop code to calculate the addresses of various values they want to retrieve from the file, plus write file-handling code to retrieve the data. While this is probably feasible for simple file types, the need to support a complex file type, multiple file types, or a changing file type can easily become a development bottleneck.

Our new language for the .NET platform, the Data Definition Language (DDL), is a unique language that aims to solve this common problem. We hope that by the time you have finished reading this article, you will want to download the DDL, try it for yourself, and drop us a line about it.

At one time we worked with a development team that was creating an application to analyze data collected from aircraft black boxes. Data dumps from black boxes have very complex formats. The team needed to read the values of aircraft parameters such as left aileron angle and right engine exhaust temperature, which were scattered across bit fields that paid no respect to word boundaries. To add to the confusion, there was no apparent standard shared between the black box formats of the same manufacturer, or even between versions of the same aircraft.

The team needed their product to support multiple black box formats without a huge development cycle between them. A little investigation made it apparent that there was no ready solution for supporting data retrieval from arbitrary file formats. This is the problem that the DDL was designed to solve.

How does the DDL understand the format of the file you want to read data from? The DDL is a language designed to be simple and intuitive for expressing data formats. To develop a solution with the DDL, you first write a DDL script that represents the file format. Once the script is written, you run the DDL engine/interpreter on the script, provide the engine with your data file, and you are ready to start reading information from the file.

So how can you use the DDL in developing an application? The DDL engine is designed to be an interpreter that can be hosted by any .NET application, which means that your VB.NET or C# programs can act as hosts for the DDL. A typical DDL solution (see Figure 1) would consist of a parent application that contains the user interface and the business logic and that hosts the DDL engine, with which you can programmatically interact. The DDL engine simply acts as a substitute for your complex file-handling code; it does not dictate what you do with the data that you have retrieved.


Figure 1

Once you have hosted the DDL engine, you can tell it to "load this DDL script file" so that it understands your file format. Then you can tell it to "use this data file" so that it can apply your DDL script (which defines the data format) to the actual data file. Finally, you tell it "give me the value of left aileron angle" and the DDL engine looks at the script, works out where in the data file "left aileron angle" is located, and reads out the value for your host application.

What Are DDL Script Files Like?
The DDL provides you with primitive types that represent runs of bits, from a single bit up to 32 bits. They are named i1, i2, i3, and so on up to i32. Variables can be declared with any of these types.

i16 width

This represents a 16-bit value called width. Declarations such as this should be grouped into DDL structures similar to the following:

struct Window
{
i16 width
i16 height
}
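
As an illustration of what an iN member means at the byte level, the following Python sketch reads an N-bit unsigned value from a buffer. It is not part of the DDL engine: the helper name read_bits, the sample bytes, and the little-endian bit numbering (bit 0 is the least significant bit of the first byte) are assumptions made for this example.

```python
def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Return `width` bits starting at `bit_offset` as an unsigned integer.

    Assumption: the buffer is treated as one little-endian integer, so
    bit 0 is the least significant bit of byte 0.
    """
    if not 1 <= width <= 32:
        raise ValueError("width must be between 1 and 32 bits")
    if bit_offset < 0 or bit_offset + width > len(data) * 8:
        raise ValueError("field lies outside the buffer")
    whole = int.from_bytes(data, "little")
    return (whole >> bit_offset) & ((1 << width) - 1)


# A hypothetical 4-byte Window record: width = 640, height = 480.
window = (640).to_bytes(2, "little") + (480).to_bytes(2, "little")
width = read_bits(window, 0, 16)    # i16 width, at bit 0
height = read_bits(window, 16, 16)  # i16 height, right after width
print(width, height)                # prints: 640 480
```

Because the helper works on bit offsets rather than byte offsets, the same call handles an i1 flag or an i7 field that does not start on a byte boundary.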

A DDL script file can contain multiple structures like the one above. Just as a member can be declared with a primitive bit type, it can also be declared with another structure type.

struct Bitmap
{
i8 BitsPerPixel
Window w
}

The DDL script requires that you mark one structure in the script file as the "init" structure. The DDL uses the init structure as the first structure of your data file; the init structure is mapped to offset zero of your data file. The init structure and its members (which may be instances of some of the other structures declared in the script file) are expected to represent the entire data file.
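
One way to picture how the init structure maps onto the file is to compute where each member would land. The Python sketch below does this for the Bitmap and Window structures above, as if Bitmap were the init structure. It assumes members are packed back to back in declaration order with no padding; the STRUCTS table and the layout function are hypothetical names for this illustration and say nothing about the engine's internals.

```python
# Members of each structure in declaration order. A member whose type is
# a structure name is expanded recursively.
STRUCTS = {
    "Window": [("i16", "width"), ("i16", "height")],
    "Bitmap": [("i8", "BitsPerPixel"), ("Window", "w")],
}


def layout(struct_name, base_bit=0, prefix=""):
    """Return (offsets, next_bit).

    offsets maps a dotted member name to (bit_offset, width); next_bit is
    the first bit after the structure.
    """
    offsets = {}
    cursor = base_bit
    for type_name, member in STRUCTS[struct_name]:
        name = prefix + member
        if type_name in STRUCTS:            # nested structure
            nested, cursor = layout(type_name, cursor, name + ".")
            offsets.update(nested)
        else:                               # primitive i1..i32
            width = int(type_name[1:])
            offsets[name] = (cursor, width)
            cursor += width
    return offsets, cursor


# Offset zero of the data file corresponds to base_bit=0.
offsets, total_bits = layout("Bitmap")
for name, (bit, width) in offsets.items():
    print(f"{name}: bit {bit}, {width} bits")
print("total bits:", total_bits)
```

Under these assumptions BitsPerPixel occupies bits 0 to 7, w.width starts at bit 8, w.height at bit 24, and the whole structure spans 40 bits.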


Figure 2

DDL structures are different in concept from structures in languages such as C. First, a DDL structure can contain members that are present only when a condition holds. Second, a member in a DDL structure simply represents a region in the data file, e.g., an i8-type member represents a byte-sized region. Members in a structure can have their offsets from the base address of the structure calculated automatically, or they can be given explicit addresses. The following script demonstrates both.

struct EmployeeData
{
i32 empNumber
i1 fPhoneNumberProvided
i7 fUndefined
when( fPhoneNumberProvided == 1 )
{
i8[10] phoneNumberString
};
@ 0,0 i24 empSerialNumber
i8 empDesigCode
}

This simple script demonstrates some interesting things. An instance of the EmployeeData structure contains a 10-byte member called phoneNumberString only if the fPhoneNumberProvided flag bit is set. The notation "@ 0,0" is an explicit address specification: it places the declaration immediately following it at an offset of 0 bits from the start of the structure. Thus, on the little-endian Intel architecture, the lower 3 bytes of empNumber are the empSerialNumber and the most significant byte holds the employee's designation code.
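
To make the conditional member and the "@ 0,0" overlay concrete, here is a Python sketch that decodes one EmployeeData record by hand. It only mimics what the script describes and is not how the engine is implemented. The function names, the sample values, and the position of the i1 flag (taken to be the lowest bit of the fifth byte) are assumptions.

```python
def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits at `bit_offset`, numbering bits little-endian."""
    whole = int.from_bytes(data, "little")
    return (whole >> bit_offset) & ((1 << width) - 1)


def decode_employee(data: bytes) -> dict:
    """Decode one EmployeeData record that starts at data[0]."""
    record = {
        "empNumber": read_bits(data, 0, 32),
        # Assumption: the i1 flag is the lowest bit of the fifth byte.
        "fPhoneNumberProvided": read_bits(data, 32, 1),
        # "@ 0,0" restarts at the structure base, so these two members
        # overlay the same 32 bits that empNumber occupies.
        "empSerialNumber": read_bits(data, 0, 24),
        "empDesigCode": read_bits(data, 24, 8),
    }
    if record["fPhoneNumberProvided"] == 1:
        # The when(...) block contributes 10 more bytes only in this case.
        record["phoneNumberString"] = data[5:15]
    return record


# Hypothetical sample: designation code 7, serial number 301.
emp_number = (7 << 24) | 301
with_phone = decode_employee(
    emp_number.to_bytes(4, "little") + b"\x01" + b"5550001234")
without_phone = decode_employee(emp_number.to_bytes(4, "little") + b"\x00")

print(with_phone["empDesigCode"], with_phone["empSerialNumber"])  # 7 301
print("phoneNumberString" in without_phone)                       # False
```

The overlay costs no extra space in the record: empSerialNumber and empDesigCode are simply two further views of the bytes already covered by empNumber.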


Figure 3

The script also demonstrates a simple array of 10 elements of type i8 (a byte). Unlike in C, the size of an array can be specified via an expression. To illustrate the power of arrays, consider the following snippet, which uses the employee structure shown earlier.

struct EmployeeHeader : init
{
i16 empCount
EmployeeData [empCount] emps
}

The first member gives the number of employees whose data is provided, and "emps" represents an array of EmployeeData instances. You can now ask the DDL engine for the designation code of the 5th employee, and the engine determines the location of that employee and retrieves its "empDesigCode" value for you. Remember that EmployeeData has a member that occurs conditionally, which means that each EmployeeData instance can have a different size. The DDL engine determines internally that there is a dependency, and checks the flag in each of the preceding EmployeeData instances to find the actual location of the 5th instance.
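
The walk over preceding instances can be sketched in a few lines of Python. This is an illustration of the idea only, not the engine's algorithm. It assumes each record is 5 bytes (empNumber plus the flag byte) or 15 bytes when the phone number is present, that the flag is the lowest bit of the record's fifth byte, and that empCount is stored little-endian; all function names and sample data are hypothetical.

```python
def employee_size(data: bytes, offset: int) -> int:
    """Size in bytes of the EmployeeData record starting at `offset`."""
    flag = data[offset + 4] & 1          # fPhoneNumberProvided
    return 15 if flag else 5


def locate_employee(data: bytes, index: int) -> int:
    """Byte offset of emps[index] inside an EmployeeHeader file image."""
    emp_count = int.from_bytes(data[0:2], "little")   # i16 empCount
    if not 0 <= index < emp_count:
        raise IndexError("employee index out of range")
    offset = 2                                        # emps starts after empCount
    for _ in range(index):
        # Each earlier record must be inspected because its size
        # depends on its own flag bit.
        offset += employee_size(data, offset)
    return offset


def make_employee(number: int, phone: bytes = None) -> bytes:
    """Build a sample record (test data only)."""
    body = number.to_bytes(4, "little")
    if phone is None:
        return body + b"\x00"
    return body + b"\x01" + phone


records = [
    make_employee(100),
    make_employee(101, b"5550000001"),
    make_employee(102),
]
data = len(records).to_bytes(2, "little") + b"".join(records)

print([locate_employee(data, i) for i in range(3)])   # prints: [2, 7, 22]
```

The second record carries a phone number, so the third record starts 15 bytes after it rather than 5, which is exactly the dependency described above.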

The DDL script provides only a minimal set of programming constructs, whose purpose is centered on defining data formats. The current version of the DDL is rich enough to support a wide variety of common file formats. Some formats, however, may be difficult or impossible to express in the DDL.

The DDL has constructs for representing address specs, size specs, conditional dependencies, different kinds of array constructs, etc. This is just a brief description of the language; the complete language description document is available from the home page.

Hosting the DDL Interpreter
This is probably the simplest part. Imagine that EmployeeHeader and EmployeeData together formed a DDL file called employee.ddl and that we had a data file in this format called empinfo.bin. The C# snippet shown in Listing 1 is all you will need to start using the DDL.

Listing 1 loads the DDL with the script file and a data file and reads values from it, but it does not show any of the error-handling code that would be required in a production-quality application. One concept you need to be familiar with to use the DDL is that of a path.

The init structure is represented as "." (dot). Any member under it is represented using its member name. Any child of that member is separated by a dot, and so on. If there is an array in the path, then the array index is separated from the member name by a ":" (colon), for example:

init.emps[0] will be represented as ".emps:0"

At any point, a GetValue() call returns the value of any variable under the current path. The Seek() method sets the current path to another location; subsequent GetValue() calls read values from that location. Bit values that are read are treated as unsigned integers, and all values are returned as "double" types.
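
The current-path idea can be modeled with a small Python class that walks an already decoded tree. This is a toy model and not the DDL API: the PathCursor class, its method names, and the sample data are hypothetical, and it assumes that every path is written from the init structure, as in the examples above.

```python
class PathCursor:
    """Toy model of a current path over already decoded data."""

    def __init__(self, tree):
        self.tree = tree        # stands in for the init structure
        self.current = tree     # the current path starts at "."

    @staticmethod
    def parse(path):
        """Split '.emps:5' into ['emps', 5]; '.' yields an empty list."""
        steps = []
        for part in path.lstrip(".").split("."):
            if not part:
                continue
            name, *indices = part.split(":")
            steps.append(name)
            steps.extend(int(i) for i in indices)
        return steps

    def seek(self, path):
        """Move the current path to `path`, starting from the init structure."""
        node = self.tree
        for step in self.parse(path):
            node = node[step]
        self.current = node

    def get_value(self, name):
        """Read a member under the current path, returned as a float."""
        return float(self.current[name])


decoded = {
    "empCount": 2,
    "emps": [
        {"empNumber": 100, "empDesigCode": 0},
        {"empNumber": (7 << 24) | 301, "empDesigCode": 7},
    ],
}
cursor = PathCursor(decoded)
count = cursor.get_value("empCount")      # read from the init structure
cursor.seek(".emps:1")
desig = cursor.get_value("empDesigCode")  # read from emps[1]
print(count, desig)                       # prints: 2.0 7.0
```

Returning every value as a double loses nothing for these types: a double represents every unsigned integer up to 2^53 exactly, which is well beyond the 32-bit maximum.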

For details, a document describing the API exposed by the DDL is available on the home page.

The DDL engine is currently available for download as System.DDL.dll. This is a mixed-mode .NET assembly developed in Managed C++ and can be hosted in any .NET application. The entire source code of the DDL is also available for download. The DDL is offered for use free of cost and is currently not under any licensing or royalty restrictions. We, however, expect that in return we will get feedback that can help us improve the DDL. If the DDL is used for commercial purposes, we hope that the authors will drop us a note and possibly give credit where credit is due; this would help in popularizing the DDL. You are, however, not required to do any of these and are free to use the DDL without any acknowledgment at all.

A console program is also available that can be used to test run DDL scripts you are developing. It is also available in source form as an example of hosting the DDL. The site also offers tutorial material about the DDL console, as well as documentation about the language, API, internal algorithms, and such.

Present and Future
The DDL in its .NET avatar is currently in Beta 1 status and is the work of two people. We believe that the idea of a generally useful DDL system has substance and are hoping to work toward it.

For future development we are hoping to strengthen the DDL language so that it can be used to express data formats that are currently difficult or impossible to express. Plans are also under way for a DDL compiler. The compiler will take a .ddl file as input and generate a .NET assembly as output that will be code streamlined for your DDL script, rather than a general purpose DDL interpreter.

We are hoping to build a community around the DDL and would like to invite you to join the DDL development project. Input for future design aspects, known issues, etc., would be appreciated.

Resource

  • The DDL project home page: http://ddl.sscli.net

    Author Bios
    Pooja Malpani is one of the youngest Microsoft MVPs in India for .NET. She has been working on .NET for three years and gives talks/seminars on .NET and Web services at user groups/forums and universities. Her interests include algorithm design, language theory, and programming in general. Pooja is currently working with the .NET team of Cognizant Technology Solutions, Bangalore, as a programmer analyst.
    sdolly@sscli.net

    Roshan James graduated from Model Engineering College in 2002 and is one of the youngest Microsoft MVPs in India. He works for Cognizant Technology Solutions' Microsoft Tech. Group at Bangalore. A math and physics buff who turned quasi-geek with a DOS box 7 years back, he likes exploring operating systems, languages, compilers, and runtimes.
    spark@sscli.net

    
    
    
    Listing 1
    
    using System;
    using System.DDL;

    class DDLTestClass
    {
        static void Main()
        {
            // Initialize the DDL engine with the script and the data file.
            ManagedDDLEngine ddl = new ManagedDDLEngine();
            ddl.LoadSourceFile("employee.ddl");
            ddl.OpenDataFile("empinfo.bin");
            ddl.InterpretData(); // this call is needed to map the source to the data

            // GetValue() reads the value of a variable under the current path.
            Console.WriteLine("No of employees = {0}",
                ddl.GetValue("empCount"));

            // Contents of emps[0]: Seek() changes the current path into a member.
            ddl.Seek(".emps:0");
            Console.WriteLine("Data of 0th Employee \n\t " +
                "empNumber={0}",
                ddl.GetValue("empNumber"));

            // Contents of emps[5].
            ddl.Seek(".emps:5");
            Console.WriteLine("Data of 5th Employee \n\t " +
                "empNumber={0} \n\t " +
                "empDesigCode={1}, \n\t " +
                "empSerialNumber={2}",
                ddl.GetValue("empNumber"),
                ddl.GetValue("empDesigCode"),
                ddl.GetValue("empSerialNumber"));

            ddl.Dispose(); // clean up
        }
    }
    

    All Rights Reserved
    Copyright ©  2004 SYS-CON Media, Inc.
