Friday, July 28, 2006

Why PySTDF?

One of the reasons I'm releasing PySTDF is that I haven't seen anything quite like it out there. I have seen a few projects in this space, as well as plenty of commercial tools for working with STDF. But nothing quite like PySTDF -- so how is PySTDF different?

Stream-oriented

PySTDF was designed first and foremost to be an event-based parser. If you are familiar with XML, this is a similar approach to SAX parsing. If you are not, no problem. The idea is that you set up actions to handle the different record types, and the parser calls them as it reads each record. This has many advantages:
  • Very fast
  • Low memory overhead
  • Very flexible
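
To make the event-based idea concrete, here is a minimal sketch in plain Python. It is not PySTDF's actual API: `parse_records`, `make_record` and the `handlers` dictionary are hypothetical names invented for illustration. It relies only on the STDF V4 record header layout (a 2-byte length, a 1-byte record type and a 1-byte subtype), and it assumes little-endian data instead of detecting the byte order from the file.

```python
import io
import struct

# STDF V4 record header: 2-byte length, 1-byte type, 1-byte subtype.
# Assumption: little-endian byte order (a real parser must detect this).
HEADER = struct.Struct("<HBB")


def parse_records(stream, handlers):
    """Read records one at a time and call the handler registered for each.

    Only one record is held in memory at once; records without a
    registered handler are skipped. Returns the number of records read.
    """
    count = 0
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of stream
        rec_len, rec_typ, rec_sub = HEADER.unpack(header)
        body = stream.read(rec_len)
        handler = handlers.get((rec_typ, rec_sub))
        if handler is not None:
            handler(body)
        count += 1
    return count


def make_record(rec_typ, rec_sub, body):
    """Hypothetical helper: build one raw record for the demo below."""
    return HEADER.pack(len(body), rec_typ, rec_sub) + body


# Register actions for two record types: PIR is (5, 10), PRR is (5, 20).
seen = []
handlers = {
    (5, 10): lambda body: seen.append(("PIR", body)),
    (5, 20): lambda body: seen.append(("PRR", body)),
}

# Three records with dummy bodies; the PTR (15, 10) has no handler.
data = (
    make_record(5, 10, b"\x01\x01")
    + make_record(15, 10, b"\x00" * 4)
    + make_record(5, 20, b"\x01\x01")
)
total = parse_records(io.BytesIO(data), handlers)
```

Because the loop never builds a tree of the whole file, memory use stays flat no matter how large the datalog is, and adding a new behavior is just a matter of registering another handler.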

Python

Python is used in scientific applications, such as biotechnology and physics. I think the reason for this is that scientists are more concerned with solving problems and playing with data -- Python's simple language doesn't get in the way, and performs well enough to get the job done.

Deals with STDF's warts

STDF is ubiquitous, and convenient as a standard datalog format -- but how standard is it? Or usable? In my experience I have struggled with many of STDF's warts:
  1. STDF has some really strange data types -- like variable length bit-fields, fields that specify the size of other fields, and other weirdness. Writing a parser for all these cases isn't fun.

  2. STDF is hard to use
    Semiconductor test engineers should be able to play with data without having to deal with the messiness of the STDF format. You were probably hoping to load that data into some kind of statistical analysis tool, right?

  3. Broken, dirty data
    STDF data is only as good as the ATE vendor's implementation of the format, and the identifiers used in the testing process. In my experience, there are many cases where the data needs to be repaired, cleaned or otherwise preprocessed. A stream-oriented parser is well-suited to solve many of these issues.

  4. C libraries
    Engineers need to be able to play with data, not wrestle with compilers, memory allocation, and pointers. Programming in C gets in the way of experimentation and rapid application development.
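
As an illustration of the first point in the list above (fields that specify the size of other fields), here is a minimal sketch of reading two STDF V4 data types: `C*n`, a string prefixed by a 1-byte length, and `D*n`, a bit field prefixed by a 2-byte bit count. The function names `read_cn` and `read_dn` are hypothetical and are not taken from PySTDF; the sketch also assumes little-endian data and ASCII text.

```python
import struct


def read_cn(buf, offset):
    """Read a C*n field: one length byte, then that many characters.

    Returns the decoded text and the offset of the next field.
    """
    length = buf[offset]
    start = offset + 1
    text = buf[start:start + length].decode("ascii")
    return text, start + length


def read_dn(buf, offset):
    """Read a D*n field: a 2-byte bit count, then the bytes holding the bits.

    Returns the bit count, the raw bytes, and the offset of the next field.
    Assumption: little-endian byte order.
    """
    (nbits,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    nbytes = (nbits + 7) // 8  # round up to whole bytes
    return nbits, buf[start:start + nbytes], start + nbytes


# A made-up buffer: the string "LOT1" followed by a 10-bit field.
buf = b"\x04LOT1" + b"\x0a\x00\xff\x03"
lot, pos = read_cn(buf, 0)
nbits, bits, end = read_dn(buf, pos)
```

Neither field can be located without first decoding the one before it, which is why every reader returns the next offset along with the value.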

I would like to hear your experiences with the library and any suggestions for where to take it, how I can improve upon it, and applications you might like to see built on top of it. Also, I am interested in your experiences with STDF!

Announcing PySTDF

PySTDF is a little pet project of mine, a Python module for processing STDF. STDF (Standard Test Data Format) is a binary datalog format originally developed by Teradyne and supported by most major automated test equipment (ATE) platforms. I have several years of experience with software development in a semiconductor test environment, and as a result I am very familiar with this ubiquitous format.

When I was learning Python a while back I wrote a simple parser in my spare time to understand two concepts in "Pythonic" programming: functional programming and metaclasses. Out of this hacking around, and a little cleanup, comes PySTDF.

Visit the PySTDF project page.